Big Data Analytics with Hadoop
Rijul_Sahu
November, 2018
Competitor Product Change Dashboards
Implemented competitor product change dashboards using Hadoop-based big data solutions for a leading UK-based electronic components manufacturer and distributor operating across 32 countries. The solution addressed challenges in managing massive monthly data volumes (6+ billion records), delivering actionable competitive intelligence for pricing and inventory strategies at a global scale.

Description
The client sought to efficiently analyze competitors' pricing trends to support new and existing product lines.
They faced:
- Large-scale data processing challenges due to high data volume, velocity, and variety.
- Manual, time-intensive data handling.
- Limitations in their existing BI architecture, which hindered timely and holistic reporting on global competitor KPIs (price changes, inventory, trends, and matched articles).
Key Challenges
- Manual and slow data processing.
- Difficulty in executing competitive analysis with massive, diverse data sets.
- Inability to quickly analyze historical trends involving price and product category changes.
- Complications in meeting critical deadlines (ETAs).
Solution Highlights
- Designed and deployed a 6-node Hadoop cluster (20 cores, 150 GB RAM, 2 TB storage) for scalable data ingestion and processing.
- Transitioned data storage to optimized Hive tables in row-columnar format, enabling efficient retrieval and analysis.
- Developed business logic and ETL workflows using Apache Pig to integrate and automate data processing across multiple systems.
- Automated end-to-end workflows, from raw data extraction through to interactive reporting.
- Collaborated with BI teams to implement Tableau dashboards driven by real-time Hive-aggregated datasets, offering actionable, ad-hoc insights.
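To make the "price changes" KPI concrete, the following is a minimal Python sketch of the kind of snapshot comparison such a pipeline performs. It is illustrative only: the production logic described above ran in Apache Pig and Hive, and the record layout, the function names (`detect_price_changes`, `summarize_by_competitor`), and the sample values are all assumptions made for this example rather than details of the delivered solution.

```python
from collections import defaultdict
from typing import NamedTuple


class PriceRecord(NamedTuple):
    """Hypothetical record layout: one competitor price for one article in one market."""
    competitor: str
    article_id: str
    country: str
    price: float


def detect_price_changes(previous, current):
    """Compare two snapshots and return one tuple per changed price:
    (competitor, article_id, country, old_price, new_price, pct_change)."""
    # Index the earlier snapshot by its natural key for constant-time lookups.
    prev_index = {(r.competitor, r.article_id, r.country): r.price for r in previous}
    changes = []
    for r in current:
        key = (r.competitor, r.article_id, r.country)
        old = prev_index.get(key)
        # Skip newly listed articles and unchanged prices.
        if old is None or old == r.price:
            continue
        # Percentage change is undefined when the old price is zero.
        pct = round((r.price - old) / old * 100, 2) if old else None
        changes.append((*key, old, r.price, pct))
    return changes


def summarize_by_competitor(changes):
    """Count price increases and decreases per competitor (a dashboard-style rollup)."""
    summary = defaultdict(lambda: {"increases": 0, "decreases": 0})
    for competitor, _, _, old, new, _ in changes:
        summary[competitor]["increases" if new > old else "decreases"] += 1
    return dict(summary)
```

At the data volumes described here, the same join-and-aggregate pattern would be expressed as a distributed job (for example, a Pig `JOIN` followed by a `GROUP`) rather than run in memory; the sketch only shows the logic on a small scale.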
Results & Impact
- Reduced dashboard and data processing turnaround time by 83%.
- Deployed a scalable Pricing Data Warehouse managing 12+ million inventory items and 43 competitors globally.
- Enabled analysis of 6 months of competitor data (~6 billion records) at the inventory level, which was previously unmanageable with a traditional RDBMS.
- Decreased report generation lead time from 3 weeks to just 1 day.
- Batch processing now supports all global markets (including DE, FR, GB, IT, JP, CN) simultaneously.
- Automated workflows saved approximately 3 weeks of manual effort monthly.
- Improved process efficiency by 99% and eliminated dependence on manual file processing.
- Achieved $150K annual savings on data processing, storage, and analytics.
- Freed up 2 FTEs (full-time employees) through automation.