Data Integration using Python & Alteryx
Rijul_Sahu
January 2020
A digital process in which an enterprise reporting solution is implemented on large datasets for descriptive analytics.

Description
The client wanted an automated reporting solution that could serve multiple business units, enable effective and timely decision making, and cover the Adidas and Reebok global business accounts.
The goal of the project was to replace the existing Syntasa application as the clickstream data source with the Adobe API, using in-house toolsets.
Challenges
The key challenges faced in this process were:
- High processing costs.
- High license costs.
- Security concerns.
- Lack of transparency on functionality.
- Dependency on vendor.
Solution
I designed and developed a solution using Python and Alteryx, which consists of three steps:
- Pull the data from the Adobe API, with the code written in Python.
- Transform the data and apply business rules in Alteryx workflows.
- Push the analytics-ready data set into the Exasol in-memory database for enterprise-wide reporting and analysis.
All of these steps run in a tightly coordinated, scheduled Alteryx wrapper workflow.
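The pull, transform and push sequence can be sketched in plain Python. This is only an illustration of the control flow, not the production setup: in the project the orchestration is done by an Alteryx wrapper workflow and the transformation by Alteryx workflows. The names `run_pipeline`, `pull_stub`, `transform_stub`, `load_stub` and `warehouse`, the sample rows, and the business rule are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def run_pipeline(
    pull: Callable[[], List[Row]],
    transform: Callable[[List[Row]], List[Row]],
    load: Callable[[List[Row]], None],
) -> int:
    """Run pull -> transform -> load in order; return the number of rows loaded."""
    raw_rows = pull()
    ready_rows = transform(raw_rows)
    load(ready_rows)
    return len(ready_rows)


# Hypothetical stand-ins for the three real steps.
def pull_stub() -> List[Row]:
    # Made-up rows; a real pull would call the Adobe API here.
    return [
        {"business_unit": " football ", "visits": 120},
        {"business_unit": "yeezy", "visits": 0},
    ]


def transform_stub(rows: List[Row]) -> List[Row]:
    # Example business rule (an assumption): trim names, drop rows with no visits.
    return [
        {**row, "business_unit": str(row["business_unit"]).strip()}
        for row in rows
        if row["visits"]
    ]


warehouse: List[Row] = []  # stands in for the Exasol target table


def load_stub(rows: List[Row]) -> None:
    warehouse.extend(rows)


loaded = run_pipeline(pull_stub, transform_stub, load_stub)
```

Passing the three steps in as functions keeps the sequencing separate from the steps themselves, which is roughly the role a wrapper workflow plays.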
I also integrated a custom-built ‘Slack bot’ into the production system, which sends daily job notifications to Adidas’ Global DevOps team.
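A job notification of this kind could look roughly like the sketch below. It assumes the message is delivered through a Slack incoming webhook, which accepts a JSON payload with a `text` field; the post does not say which Slack mechanism the bot actually uses. The function names, the message wording and the webhook URL are hypothetical.

```python
import json
import urllib.request


def build_job_message(job_name: str, status: str, row_count: int) -> dict:
    """Build the JSON payload for a daily job notification (hypothetical wording)."""
    return {
        "text": f"Job {job_name} finished with status {status}; "
                f"{row_count} rows loaded."
    }


def post_to_slack(webhook_url: str, message: dict, timeout: float = 10.0) -> int:
    """POST the payload to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

At the end of a scheduled run this would be called as `post_to_slack(webhook_url, build_job_message("daily_load", "SUCCESS", 1000))`, where `webhook_url` is the URL Slack issues for the target channel.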
Benefits
Efforts in designing and implementing this solution led to:
- In-house knowledge development – no dependency on vendor.
- Elimination of ongoing delivery costs for the Syntasa application.
- Consolidation of data from multiple business units, such as Adidas football and Adidas Yeezy, into a single data repository, enabling multi-level analysis for multiple stakeholders.
- Removal of manual intervention – error-free delivery.
- Automated data quality control through validation checks added to the Alteryx workflow.
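To give a sense of what such validation checks involve, here is a minimal Python sketch. The actual checks in this project live in the Alteryx workflow and are not described in detail, so the two rules shown (a minimum row count and required fields that must not be empty) and the name `validate_rows` are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def validate_rows(
    rows: List[Dict[str, object]],
    required_fields: Sequence[str],
    min_rows: int = 1,
) -> List[str]:
    """Return a list of problems found; an empty list means the data passed."""
    problems: List[str] = []
    # Check 1: the pull returned at least the expected number of rows.
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")
    # Check 2: every required field is present and not empty.
    for index, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append(f"row {index}: missing value for '{field}'")
    return problems
```

A scheduled job can stop before the load step whenever the returned list is not empty, and include the problems in its notification.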