A lack of clean, well-structured data is one of the biggest barriers to efficiency and cost-reduction programs, according to supply chain and logistics organizations. And while it is still far from perfect, cleaning data is getting easier thanks to machine learning. Tools like Adiona's Diagnostics APIs help to improve the traditional Extract-Transform-Load (ETL) process by automatically detecting and sorting data from complex systems like ERPs and TMSs. Traditionally, ETL is a highly manual process dominated by tools like Excel. It can take dozens or hundreds of hours for a skilled employee to work out how to port a dataset from one tool to another just to test it. And if the testing is unsuccessful, that feels like wasted time. If the testing is successful, there is still much more work to be done when moving towards an actual integration, for example between an ERP and a solution API like ours. Let's take an example. This spreadsheet shows a sample of data directly from an ERP system. Can you tell what all of the fields mean?
Some of them are fairly obvious to a human, but they wouldn't be obvious to an ETL script. You'd need to manually select each field and map it to the corresponding field in the tool you want to use. For example, "Delv_Qty" here would be "Quantity" in Adiona's system. Then you'd also need to ensure the units match: for instance, the source field might store values with two decimal places while the target field requires an integer.
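Here is a minimal sketch of what that manual mapping step looks like in code. The "Delv_Qty" to "Quantity" mapping comes from the example above; the other field names and the `map_record` helper are illustrative assumptions, not Adiona's actual schema.

```python
from decimal import Decimal

# Hypothetical mapping from ERP field names to target-schema names.
# Only "Delv_Qty" -> "Quantity" is taken from the example; the rest
# are made-up stand-ins.
FIELD_MAP = {
    "Delv_Qty": "Quantity",
    "Delv_Lat": "Latitude",
    "Delv_Lon": "Longitude",
}

def map_record(erp_record):
    """Rename ERP fields and coerce units to the target schema."""
    mapped = {}
    for field, value in erp_record.items():
        target = FIELD_MAP.get(field)
        if target is None:
            continue  # unmapped fields are simply dropped in this sketch
        if target == "Quantity":
            # Source stores quantities with two decimal places;
            # the target schema expects an integer.
            value = int(Decimal(value))
        mapped[target] = value
    return mapped

print(map_record({"Delv_Qty": "12.00", "Delv_Lat": -33.87, "Order_Ref": "A1"}))
# {'Quantity': 12, 'Latitude': -33.87}
```

Every entry in `FIELD_MAP` has to be worked out by a person, which is exactly the slow part that machine learning can shortcut.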
Also, some fields are NOT obvious even to a human! What is the difference between "Tkt_Location_Lat" and "Delv_Lat"? Which latitude value should I use to specify my delivery location in Adiona, and how do I get it back into the correct field of the ERP system afterwards? A human would need to investigate the context of the data to decide. And there are dozens of fields here that need to be mapped.
This is where machine learning can help. What is machine learning (ML)? In this context, it's a way to use existing data mappings as training examples and mimic a human's ability to decide things from context. If you give a well-designed ML model enough examples (millions and millions), it can learn how to judge the context of data in a similar way to a human. And of course, much faster!
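To make the idea concrete, here is a toy version of learning from past mappings. It counts which field-name tokens historically appeared under which target label, then scores new field names against those counts. The training pairs and label names are invented for illustration; a production model would train on millions of real mappings and far richer features (sample values, statistics, surrounding fields), not just name tokens.

```python
from collections import Counter, defaultdict

def tokens(field_name):
    """Split a field name like 'Delv_Qty' into lowercase tokens."""
    return field_name.lower().replace("-", "_").split("_")

# Tiny made-up training set: (ERP field name, known target label).
TRAINING = [
    ("Delv_Qty", "Quantity"), ("Order_Qty", "Quantity"),
    ("Ship_Quantity", "Quantity"), ("Delv_Lat", "Latitude"),
    ("Dest_Latitude", "Latitude"), ("Delv_Date", "Date"),
    ("Ship_Date", "Date"),
]

# "Training": count how often each token appears under each label.
token_counts = defaultdict(Counter)
for name, label in TRAINING:
    for tok in tokens(name):
        token_counts[label][tok] += 1

def predict_label(field_name):
    """Score each label by how strongly its training tokens match."""
    scores = {
        label: sum(counts[tok] for tok in tokens(field_name))
        for label, counts in token_counts.items()
    }
    return max(scores, key=scores.get)

print(predict_label("Pickup_Qty"))  # Quantity
print(predict_label("Cust_Lat"))    # Latitude
```

The model has never seen "Pickup_Qty", but the shared "qty" token lets it generalize, which is a miniature version of judging from context.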
How does it do this? There are three basic steps, shown in the diagram below.
1) Analyze - this is where the model parses the data
2) Label - this is where the model uses a variety of algorithms to 'label' each field with the correct data type
3) Normalize - once the data is labeled correctly, a different set of algorithms can transform each field into the format required for the solution system.
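The three steps above can be sketched as a small pipeline. The labeling step here uses simple pattern rules as a stand-in for the ML model, and the date format (day/month/year) is an assumption for the example.

```python
import re
from datetime import datetime

def analyze(raw_value):
    """Step 1: parse the raw input into a candidate value."""
    return raw_value.strip()

def label(value):
    """Step 2: guess the data type. Rule-based stand-in for an ML labeler."""
    if re.fullmatch(r"\d{1,2}/\d{1,2}/\d{4}", value):
        return "date"
    if re.fullmatch(r"-?\d+\.\d+", value):
        return "decimal"
    return "text"

def normalize(value, kind):
    """Step 3: transform into the target system's format."""
    if kind == "date":
        # Assumes day/month/year input; emit ISO 8601 for the target system.
        return datetime.strptime(value, "%d/%m/%Y").date().isoformat()
    if kind == "decimal":
        return float(value)
    return value

def pipeline(raw_value):
    value = analyze(raw_value)
    kind = label(value)
    return kind, normalize(value, kind)

print(pipeline(" 25/11/2022 "))  # ('date', '2022-11-25')
print(pipeline("-33.8675"))      # ('decimal', -33.8675)
```

In the real system the `label` step is where the trained model does the heavy lifting; everything else is deterministic transformation.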
Look closely at the diagram. You can see how values like addresses and dates are parsed, labeled, and then transformed into a uniform format so they can be ingested by our system. Notice how the unnecessary data in the address fields is also removed wherever possible.
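Removing unnecessary data from address fields might look something like this. The noise patterns below (parenthetical delivery notes, "attn"/"c/o" lines) are illustrative guesses; the real system decides what to strip from learned context rather than fixed rules.

```python
import re

# Illustrative noise patterns; a learned model would decide what to strip.
NOISE = [
    r"\(.*?\)",             # parenthetical delivery notes
    r"\b(attn|c/o)\b[^,]*,",  # attention/care-of lines up to the next comma
]

def clean_address(raw):
    """Remove non-address noise and normalize spacing and commas."""
    cleaned = raw
    for pattern in NOISE:
        cleaned = re.sub(pattern, "", cleaned, flags=re.IGNORECASE)
    cleaned = re.sub(r"\s+", " ", cleaned)   # collapse runs of whitespace
    cleaned = re.sub(r"\s,", ",", cleaned)   # no space before commas
    return cleaned.strip(" ,")

print(clean_address("12 Main St (leave at back door), Sydney NSW 2000"))
# 12 Main St, Sydney NSW 2000
```

The output is a uniform address string that a downstream geocoder or routing system can ingest reliably.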
This technique not only speeds up the demonstration and ROI calculation process for using Adiona's system, but also partially automates the integration process. We intend to release it as a standalone product suite in the future that can be used for other solution integrations or data science investigations.
Here for our Black Friday Sale? Click the banner below and register to receive 15% off the first three months of any Adiona subscription if you sign on before 31st December 2022. Terms and Conditions apply. Offer not valid for existing customers.