Big Data Analytics with Hadoop


Rijul_Sahu

November, 2018


Competitor Product Change Dashboards

Implemented competitor product change dashboards using Hadoop-based big data solutions for a leading UK-based electronic components manufacturer and distributor operating across 32 countries. The solution addressed challenges in managing massive monthly data volumes (6+ billion records), delivering actionable competitive intelligence for pricing and inventory strategies at a global scale.




Description

The client sought to efficiently analyze competitors' pricing trends to support new and existing product lines.

They faced:

  • Large-scale data processing challenges due to high data volume, velocity, and variety.

  • Manual, time-intensive data handling.

  • Limitations in their existing BI architecture, which hindered timely and holistic reporting on global competitor KPIs (price changes, inventory, trends, and matched articles).


Key Challenges

  • Manual and slow data processing.

  • Difficulty in executing competitive analysis with massive, diverse data sets.

  • Inability to quickly analyze historical trends involving price and product category changes.

  • Complications in meeting critical deadlines (ETAs).


Solution Highlights

  • Designed and deployed a 6-node Hadoop cluster (20 cores, 150GB RAM, 2TB storage) for scalable data ingestion and processing.

  • Transitioned data storage to optimized Hive tables in row-columnar format, enabling efficient retrieval and analysis.

  • Developed business logic and ETL workflows using Apache Pig to integrate and automate data processing across multiple systems.

  • Automated end-to-end workflows, from raw data extraction through to interactive reporting.

  • Collaborated with BI teams to implement Tableau dashboards driven by real-time Hive-aggregated datasets, offering actionable, ad-hoc insights.
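The core comparison behind a product change dashboard can be illustrated with a small sketch. The project implemented its business logic in Apache Pig over Hive tables; the Python below is only an illustration of the idea, not the project's code. The record fields, function names, and sample values are all hypothetical. It compares two snapshots of competitor prices, keeps the matched articles whose price moved, and counts the changes per market.

```python
from collections import Counter
from typing import NamedTuple


class PriceRecord(NamedTuple):
    # Hypothetical record layout; the real schema is not given in the source.
    competitor: str
    market: str      # country code such as "GB" or "DE"
    article_id: str  # identifier of the matched article
    price: float


def index_snapshot(records):
    """Key each price by (competitor, market, article_id) for direct lookup."""
    return {(r.competitor, r.market, r.article_id): r.price for r in records}


def detect_price_changes(previous, current):
    """Return (key, old_price, new_price) for articles present in both
    snapshots whose price differs between them."""
    old = index_snapshot(previous)
    new = index_snapshot(current)
    changes = []
    for key in sorted(old.keys() & new.keys()):
        if old[key] != new[key]:
            changes.append((key, old[key], new[key]))
    return changes


def changes_per_market(changes):
    """Count price changes per market (second element of each key)."""
    return Counter(key[1] for key, _old, _new in changes)


# Made-up sample data for two consecutive snapshots.
previous = [
    PriceRecord("CompA", "GB", "A-100", 1.20),
    PriceRecord("CompA", "DE", "A-100", 1.35),
    PriceRecord("CompB", "GB", "B-200", 4.00),
]
current = [
    PriceRecord("CompA", "GB", "A-100", 1.10),
    PriceRecord("CompA", "DE", "A-100", 1.35),
    PriceRecord("CompB", "GB", "B-200", 4.25),
    PriceRecord("CompB", "FR", "B-300", 2.00),  # new article, no prior price
]

changes = detect_price_changes(previous, current)
market_counts = changes_per_market(changes)
print(market_counts)
```

At the volumes described here, the same join-and-compare step would run as a distributed batch job rather than in memory. The sketch ignores articles that appear in only one snapshot; a production workflow would likely report those separately as new or discontinued items.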


Results & Impact

  • Reduced dashboard and data processing turnaround time by 83%.

  • Deployed a scalable Pricing Data Warehouse managing 12+ million inventory items and 43 competitors globally.

  • Enabled inventory-level analysis of 6 months of competitor data (~6 billion records), which was previously unmanageable with a traditional RDBMS.

  • Decreased report generation lead time from 3 weeks to just 1 day.

  • Batch processing now supports all global markets (including DE, FR, GB, IT, JP, CN) simultaneously.

  • Automated workflows saved approximately 3 weeks of manual effort monthly.

  • Improved process efficiency by 99% and eliminated dependence on manual file processing.

  • Achieved $150K annual savings on data processing, storage, and analytics.

  • Freed up 2 FTEs (full-time employees) through automation.