Today, technological advances across industries are directly reshaping how products are built, developed, and delivered to the consumer. Market competitiveness increasingly rests on technological and digital pillars, making the automation and digitization of processes ever more common in companies. The impact of Industry 4.0 has opened a horizon of opportunities for the retail market to get ahead and compete at a higher level, pursuing better delivery, quality, efficiency, and effectiveness in every process that brings a product to the consumer.
AI in retail
In addition, advances in Artificial Intelligence, Machine Learning, and the IoT (Internet of Things) open new horizons for the various branches of the retail industry. The automation of warehousing processes, route monitoring, material-storage strategies, demand forecasting, and customer-satisfaction analysis are examples of procedures adopted through these technologies to obtain better results in the market.
We are living in the data age. Being prepared for it, and orienting internal processes around data, will allow companies to immerse themselves in this ocean of opportunities, resulting in cost reductions based on the analysis of losses and waste, and in more sustainability, competitiveness, and market approval.
Supply chain and S&OP
Among the different operational areas of industry, artificial intelligence stands out most strongly in the supply chain, bringing more automation to production processes. Tracing the path of operations with AI, the cycle ranges from implementing intelligent technologies in sales and operations planning (S&OP), where it sharpens the analysis behind sales strategies together with the marketing area, to finding better ways to operate and automating exhausting, repetitive jobs.
With the advancement of the Internet of Things, it has become far more efficient to capture data from the different stages of production. Obtaining data from the first task all the way to delivery to the final consumer is no longer a problem: production data can be extracted from warehouse robots, driver applications, and products connected to the internet, among other IoT collection channels. Another important point is the bridge built between stakeholders through advanced data analysis with Machine Learning and AI, which helps filter the raw-material suppliers and final suppliers most aligned with the company's interests and reduces losses in the processes of material procurement and delivery to the consumer.
AI, ML, and IoT technologies also influence revenue generation, increasing profits and improving customer and supplier relationship management. One example is intelligent dynamic pricing, which uses artificial intelligence and market- and consumer-based strategies to determine the best price (not necessarily the highest, but the most appropriate price to compete in the market) with the aim of increasing revenue.
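To make the dynamic-pricing idea concrete, here is a minimal sketch in Python. The article does not describe Aquarela's actual pricing model; the price/demand figures, the linear demand curve, and the unit cost below are illustrative assumptions only.

```python
import numpy as np

# Hypothetical price/demand history for one SKU (price in BRL, units sold per week).
prices = np.array([9.90, 10.50, 11.20, 11.90, 12.50, 13.20])
units = np.array([420, 395, 350, 310, 280, 235])

# Fit a simple linear demand curve: units ≈ intercept + slope * price.
slope, intercept = np.polyfit(prices, units, 1)

unit_cost = 6.50                          # assumed cost of goods per unit
candidates = np.arange(8.0, 16.0, 0.10)   # price grid to evaluate
expected_units = intercept + slope * candidates
expected_margin = (candidates - unit_cost) * expected_units

best = candidates[np.argmax(expected_margin)]
print(f"Suggested price: R$ {best:.2f}")
```

A real system would also fold in competitor prices, seasonality, stock levels, and elasticity per segment, re-estimating the curve as new sales data arrives.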
Demand Forecasting
Regarding demand forecasting, implementations of AI and ML can reach an accuracy of around 90%, improving demand forecasts through advanced analysis of diverse data such as weather conditions, the economic situation of the market, available quantities, consumer desire, and consumption predictability. In addition, advanced analytics and intelligent models that keep learning as more data is collected over time provide predictive actions in real time, supporting the decisions made by professionals. This reduces failures and risks in decision-making and allows operations to be adjusted when negative predictions point to potentially damaging impacts.
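As a rough illustration of this kind of forecasting pipeline (not the specific model behind the ~90% figure), the sketch below trains a standard gradient-boosting regressor on a hypothetical weekly sales history; the file name and feature columns are assumptions.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_percentage_error

# Assumed weekly history with past sales and their external drivers.
df = pd.read_csv("weekly_sales.csv")
features = ["avg_temperature", "price", "promotion_flag",
            "consumer_confidence_index", "week_of_year"]
X, y = df[features], df["units_sold"]

# Keep the time order when splitting, so we evaluate on the most recent weeks.
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False, test_size=0.2)

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("MAPE on recent weeks:", mean_absolute_percentage_error(y_test, pred))
```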
Furthermore, AI has great strength in the area of dairy and perishable products, since strategies for goods with short shelf lives and more fragile logistics need to be much sharper. This contribution is supported by collecting data and information and building predictive demand models that deliver better strategies for storing products, defining the best routes, reducing fuel waste, and forecasting where demand will be highest so that products are not kept in distant stocks, thus helping preserve them until final delivery.
Big Data
Big Data is a term that has been gaining ground in the context of Industry 4.0, representing the large mass of data, its intensive collection, and the importance of artificial intelligence and machine learning in handling information that can add a lot of value to companies. Produced at the many stages and touchpoints of the market, big data includes purchase data, consumers' online browsing, media and marketing data, and customer satisfaction with the service and/or product, among other information.
The process of collecting and storing data is complex, and analyzing such volumes by hand is an impossible human task. Thus, AI and intelligent models based on machine learning go hand in hand with big data to integrate external market data with internal company data, enabling demand forecasting and planning, greater revenue and profit, less waste, and more sustainability.
Logistics 4.0
The advances brought by Industry 4.0 are clear: widespread automation of production processes, digitalization of products to test improvements, faster information flow, and faster implementation of results.
With Industry 4.0 comes Logistics 4.0, aimed at optimizing the loading and unloading of goods. Automation and AI are applied at various stages of logistics; for example, warehouse robots that, through AI, organize products strategically and hierarchically to facilitate and speed up cargo operations.
In addition, it is possible to forecast events on highways, such as road works that interrupt routes, using AI and real-time data analysis. This allows a better route to be adopted in the present moment, instead of relying only on historical data and wasting resources, which also improves customer satisfaction and delivery speed. Poorly chosen routes, inappropriate and unnecessary use of vehicles, higher gas emissions, and high fuel and maintenance costs are problems addressed by Logistics 4.0, aiming at more accuracy, intelligence, sustainability, greater revenue, and consumer and supplier satisfaction.
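A minimal sketch of the routing idea, assuming travel times have already been adjusted by a real-time traffic feed; the road network, the travel times, and the plain Dijkstra search below are illustrative, not a production routing engine.

```python
import heapq

def shortest_route(graph, source, target):
    """Dijkstra over a dict graph: {node: [(neighbor, travel_minutes), ...]}."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, minutes in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + minutes, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical network from a distribution center ("DC") to a store.
roads = {
    "DC": [("A", 35), ("B", 50)],
    "A": [("Store", 40)],
    "B": [("Store", 20)],
}
# Simulate a real-time event: road works slow down the A -> Store stretch.
roads["A"] = [("Store", 90)]

print(shortest_route(roads, "DC", "Store"))  # -> (70, ['DC', 'B', 'Store'])
```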
AI in Retail – Final Thoughts
Implementing AI and machine learning through intelligent models is neither easy nor instantaneous. However, the preparation and construction of these technologies around the specifics of the business will pay off in several benefits.
The power of AI provides intelligent market insight, demand forecasting with higher hit rates, reduced product loss due to expiration or warehouse saturation, and precise price adjustments supported by the different variables that can influence revenue. In addition, through advanced data analysis, it is possible to filter suppliers, keeping those that deliver the best results and are most aligned with the company's values.
These gains go hand in hand with using AI for more sustainable results: better use of routes, continuous delivery, route analysis, and reductions in costs and gas emissions.
The visibility that advanced analytics and AI give over the company's entire supply chain and operations results in strong predictability of risks and failures across the initial stages, preparation, and delivery to the end customer. This predictive power, combined with intelligent strategies, consolidates real-time risk management, a drastic reduction in failures and waste, and unified control over the stages of sales, operations, production, and delivery of goods. In short, smarter and more sustainable companies have never been so close at hand. The way forward depends only on preparation and organization for greater intelligence and predictability.
What is Aquarela Advanced Analytics?
Aquarela Analytics is a pioneering Brazilian company and a reference in the application of Artificial Intelligence in industry and large companies. With the Vortx platform and DCIM methodology, it serves important global customers such as Embraer (aerospace & defence), Scania and Randon Group (automotive), Solar Br Coca-Cola (beverages), Hospital das Clínicas (healthcare), NTS-Brasil (oil & gas), and Votorantim Energia (energy), among others.
Front-end developer at Aquarela Advanced Analytics and IT technician studying Systems Analysis and Development at Instituto Federal Catarinense (IFC). Enthusiast of React, JavaScript, and NodeJS, currently learning client-side technologies.
Hi everyone, in today's demonstration we are going to show you how Big Data scenario discovery can support decision-making in a profound way across various sectors. We use AQUARELA VORTX Big Data, a tool built on groundbreaking machine-learning technology. The dataset used for the experiment was presented in the previous post about Big Data country auto-segmentation (clustering). The differences here are that this dataset also includes the Gini Index (found later on) and removes the electrification rate in rural areas, and that it seeks systemic influences towards a GOAL, in this case the Human Development Index, whereas the previous segmentation simply grouped similar countries according to their general characteristics.
The key questions for the experiment:
How many Human Development Index scenarios exist in total? And which countries belong to them?
Amongst 65 indexes, which of them have the most influence in defining a High or Low Human Development Index?
What is the DNA (set of characteristics) of a High and Low Human Development scenario?
Alright, hang on for a minute! Before you see the results, take a look at all the variables analysed in the previous post. Then try to figure out for yourself, using your intuition, what the answers to these 3 questions would be. This is a very fun and very useful cognitive exercise for scenario validation. OK?
Results after pushing the Discoverer button:
This is the overall distribution of the 188 countries: most countries present an HDI between 0.65 and 0.75, and very few are above 0.90. In total, there are 15 different HDI scenarios, of which the first 3 correspond to more than 94% of the total; those are the ones we will focus on.
The most common scenario and the average HDI
Countries with the lowest HDI
Countries with the highest HDI
Where are they located?
What factors influence HDI the most and the least?
The list marks the top and bottom 10 factors. The factor "Intimate or non-intimate partner violence ever experienced 2001-2011" was automatically removed from the ranking as it does not correlate with HDI.
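VORTX's influence ranking is proprietary, but one simple way to approximate this kind of ranking is to correlate each indicator with HDI. The sketch below assumes a CSV with one row per country and the 65 indicators as columns, including an "HDI" column; the file and column names are illustrative.

```python
import pandas as pd

# Assumed table with one row per country and the indicators as columns.
df = pd.read_csv("country_indexes.csv")
numeric = df.select_dtypes("number")

# Rank every indicator by the strength of its linear correlation with HDI.
ranking = numeric.drop(columns=["HDI"]).corrwith(numeric["HDI"]).abs()
ranking = ranking.sort_values(ascending=False)

print("Top 10 factors:\n", ranking.head(10))
print("\nBottom 10 factors:\n", ranking.tail(10))
```

Correlation only captures linear, pairwise relations, whereas the systemic analysis described above considers the factors together, so the two rankings would not necessarily match.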
What is the DNA of each main scenario?
All factors presented at once. Note that the X-axis scales change dynamically when hovering the mouse over the VORTX data scope screen.
Drilling down into the DNA
Under-Five Mortality rates vs HDI
Filtering the visualisation by the most relevant factor and HDI (HDI is the focus of the analysis, so it has the darker colour). Here we see that countries with the highest HDI have the lowest under-five mortality rates.
Gender Inequality Rate vs HDI
Gross National Income GNI per capita vs HDI
Insights and Conclusions of the study
The possibilities for generating new knowledge from this Big Data strategy are endless, but we focused on just a few questions and a few screenshots to demonstrate its value. During this research, we found it interesting to see the machine autonomously confirming some previous intuitions while breaking some preconceptions. It is important to mention that we are not measuring causation, as if one factor led to another and vice versa; the results show systemic correlations only. Here are some of the findings that caught our attention:
Gender inequality showing a strong, inverse correlation with the Human Development Index, while we live through a transition from the industrial age to the information age, in which knowledge is surpassing the physical differences between genders.
Research and development having a direct correlation to HDI.
The United States having its own scenario due to its unique systemic characteristics.
Gross National Income (GNI) per capita leading the ranking, with values around 40 thousand dollars.
Public expenditure ranking ahead of the education-related indexes.
Business applications
Applying the same questions we had at the beginning of the article, let's now see how they would look for different business scenarios:
Sales
How many scenarios exist for your sales? Which customer segments belong to each scenario?
Amongst several business factors, which of them have the most influence in defining High or Low revenue?
What is the DNA (characteristics) of a High and Low revenue scenario?
Industry
How many production/maintenance scenarios exist for your production line? Which processes belong to each scenario?
Amongst several production factors, which of them have the most influence in defining a High or Low outcome, or High or Low maintenance costs?
What is the DNA (characteristics) of a High and Low production/maintenance scenario?
Healthcare
How many patient scenarios exist for a specific disease or medical condition? Which patients belong to each scenario?
Amongst several patient characteristics, which of them have the most influence on High or Low levels of a specific disease or medical condition?
What is the DNA (characteristics) of High and Low medical-condition scenarios?
All in all, we hope this article helps ease your landing in the newest territories of machine learning. If you need more information on how this solution applies to your business scenario, please let us know. And if you found this analysis interesting and worth spreading, please share it. Many thanks on behalf of the Aquarela team!
The objective of this post is to show you what happens when we give several numbers to a machine (VORTX Big Data) and it figures out by itself how the countries should be organized into different boxes. This technique is called clustering! The questions we will answer in this post are:
How are countries segmented based on the world’s indexes?
What are the characteristics of each group?
Which factors are the most influential for the separation?
Here we go!
Data First – What comes in?
I have gathered 65 indexes for 188 countries of the world; the sources are mainly:
UNDESA 2015,
UNESCO Institute for Statistics 2015,
United Nations Statistics Division 2015,
World Bank 2015,
IMF 2015.
Selected variables for the analysis were:
Human Development Index HDI-2014
Gini coefficient 2005-2013
Adolescent birth rate 15-19 per 100k 2010-2015
Birth registration under age 5 2005-2013
Carbon dioxide emissions Average annual growth
Carbon dioxide emissions per capita 2011 Tonnes
Change in forest area percentage 1990 to 2012
Change in mobile usage 2009-2014
Consumer price index 2013
Domestic credit provided by financial sector 2013
Domestic food price level 2009 2014 index
Domestic food price level 2009-2014 volatility index
Electrification rate of population
Expected years of schooling – Years
Exports and imports percentage GDP 2013
Female Suicide Rate 100k people
Foreign direct investment net inflows percentage GDP 2013
Forest area percentage of total land area 2012
Fossil fuels percentage of total 2012
Freshwater withdrawals 2005
Gender Inequality Index 2014
General government final consumption expenditure – Annual growth 2005-2013
General government final consumption expenditure – Percentage of GDP 2005-2013
Gross domestic product GDP 2013
Gross domestic product GDP per capita
Gross fixed capital formation of GDP 2005-2013
Gross national income GNI per capita – 2011 Dollars
Homeless people due to natural disaster 2005-2014 per million people
Homicide rate per 100k people 2008-2012
Infant Mortality 2013 per thousands
International inbound tourists thousands 2013
International student mobility of total tertiary enrolment 2013
Internet users percentage of population 2014
Intimate or non-intimate partner violence ever experienced 2001-2011
Life expectancy at birth- years
Male Suicide Rate 100k people
Maternal mortality ratio deaths per 100,000 live births 2013
Mean years of schooling – Years
Mobile phone subscriptions per 100 people 2014
Natural resource depletion
Net migration rate per 1k people 2010-2015
Physicians per 10k people
Population affected by natural disasters average annual per million people 2005-2014
Population living on degraded land Percentage 2010
Population with at least some secondary education percent 2005-2013
Pre-primary 2008-2014
Primary 2008-2014
Primary school dropout rate 2008-2014
Prison population per 100k people
Private capital flows percentage GDP 2013
Public expenditure on education Percentage GDP
Public health expenditure percentage of GDP 2013
Pupil-teacher ratio primary school pupils per teacher 2008-2014
Refugees by country of origin
Remittances inflows GDP 2013
Renewable sources percentage of total 2012
Research and development expenditure 2005-2012
Secondary 2008-2014
Share of seats in parliament percentage held by woman 2014
Stock of immigrants percentage of population 2013
Taxes on income profit and capital gain 2005-2013
Tertiary 2008-2014
Total tax revenue of GDP 2005-2013
Tuberculosis rate per thousand 2012
Under-five Mortality 2013 per thousand
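Before looking at the output, here is a minimal sketch of what such a clustering step could look like in code. VORTX's own grouping algorithm is proprietary; the stand-in below uses standard k-means on standardized indicators, and the file name, column handling, and number of clusters are assumptions.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Assumed CSV with one row per country and the 65 indicators as numeric columns.
df = pd.read_csv("country_indexes.csv", index_col="Country")
X = df.select_dtypes("number")
X = X.fillna(X.median())                 # simple imputation for missing values

# The indicators live on very different scales, so standardize before clustering.
X_scaled = StandardScaler().fit_transform(X)

# Plain k-means as a stand-in for VORTX's proprietary grouping.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0)
df["group"] = kmeans.fit_predict(X_scaled)

print(df["group"].value_counts())           # size of each group
print(df[df["group"] == 0].index.tolist())  # countries in one of the groups
```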
What comes out?
Let's start by looking at the map to see where these groups are; then we move to the VORTX visualization for a better understanding of the DNA (the composition of factors of each group).
Click on the picture to play around with the map inside Google Maps.
Ok, I see the clusters, but now I want to know what combination of characteristics unites or separates them. The picture below shows the VORTX visualization considering all groups and all factors.
On the left side are the groups and their proportions. Segmentation sharpness is the measurement of the differences between groups based on all factors. On the right side is the total composition of variables, which we can call the world's DNA.
In the next figures, you will see how different the picture becomes when we select each group.
The most typical country profile, representing 51.60% of the globe. We call these the average countries.
The second most common type representing 26.46% of the globe.
This is the cluster of the so-called first-world countries, whose results are above average, representing 14.89% of the globe. The United States does not belong to this group, but Canada, Australia, New Zealand, and Israel do.
The US is numerically so different from the rest of the world that VORTX decided to place it alone in a group of its own, which had the highest distinctiveness: 38.93%.
Other countries did not have similar enough peers to share a group with; this is the case of the United Arab Emirates.
Before we finish, below are the top 5 most and the 5 least influential factors that VORTX identified as key to creating the groups (after the two lists, there is a short sketch of how such an influence ranking could be approximated).
Top 5
Maternal mortality ratio deaths per 100,000 live births 2013 – 91% influence
Under-five Mortality 2013 per thousand – 90%
Human Development Index HDI-2014 – 90%
Infant Mortality 2013 per thousands – 90%
Life expectancy at birth- years – 90%
Bottom 5
Renewable sources percentage of total 2012 – 70% influence
Total tax revenue of GDP 2005-2013 – 72%
Public health expenditure percentage of GDP 2013 – 73%
General government final consumption expenditure – Percentage of GDP 2005-2013 – 73%
General government final consumption expenditure – Annual growth 2005-2013 – 75%
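VORTX's influence metric is proprietary, but a common way to approximate how much each variable separates the groups is the ANOVA F-statistic between clusters. The sketch below reuses the hypothetical X table and df["group"] labels from the clustering sketch earlier in the post.

```python
import pandas as pd
from sklearn.feature_selection import f_classif

# Reuses X (the imputed indicator table) and df["group"] from the clustering sketch.
f_scores, _ = f_classif(X, df["group"])

influence = pd.Series(f_scores, index=X.columns).sort_values(ascending=False)
print("Most separating factors:\n", influence.head(5))
print("\nLeast separating factors:\n", influence.tail(5))
```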
Conclusions
According to VORTX, if you plan to live in another country or sell your product abroad, it would be wise to check which group that country belongs to. If it belongs to the same group as the country you live in, then you know what to expect.
Could other factors be added to or removed from the analysis? Yes, absolutely. However, sometimes it is not that easy to get the information you need at the time you need it. Big Data analyses usually have several constraints and typically rely on the type of questions posed to the data and to the algorithm, which, in turn, relies on the creativity of the data scientist.
The clustering approach is becoming more and more common in industry due to its strategic role in organizing and simplifying the decision-making chaos. After all, how could a manager look at 12,220 cells to define a regional strategy?
Any question or doubts? Or anything that calls your attention? Please leave a comment!
For those who wish to see the platform operating in practice, here is a video using data from Switzerland. Enjoy it!
In the vast majority of conversations with clients and prospects about Big Data, we soon noticed an astonishing gap between the business itself and the expectations placed on Data Analytics projects. Therefore, we carried out research to answer the following questions:
What are the main business sectors that already use Big Data?
What are the most common Big Data results per sector?
What is the minimum dataset to reach the results per sector?
The summary is organized in the table below.
Conclusions
The table presents a summary for easy understanding of the subject. However, for each business there are many more variables, opportunities and, of course, risks. It is highly recommended to use multivariate analysis algorithms to help you prioritize the data and reduce the project's cost and complexity.
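As an example of the kind of multivariate prioritization mentioned above, the sketch below uses PCA to see how much of the total variance a handful of variables carries; the dataset and column handling are assumptions, and PCA is only one of several suitable techniques.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Assumed table of candidate input variables for a Big Data project.
df = pd.read_csv("candidate_variables.csv").select_dtypes("number").dropna()

X = StandardScaler().fit_transform(df)
pca = PCA().fit(X)

# Cumulative share of total variance explained by the first components.
print(pca.explained_variance_ratio_.cumsum())

# Variables with the largest weight in the first component are good candidates to keep.
loadings = pd.Series(abs(pca.components_[0]), index=df.columns)
print(loadings.sort_values(ascending=False).head(10))
```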
There are many more sectors in which excellent results have been derived from Big Data and data-science initiatives. However, we believe these can serve as examples for the many other types of similar businesses willing to use Big Data.
Common to all sectors, Big Data projects need relevant and clear input data; therefore, it is important to have a good understanding of these datasets and of the business model itself. We have noticed that many businesses are not yet collecting the right data in their systems, which suggests the need for pre-Big Data projects. (We will write about this soon.)
One obstacle for Big Data projects is the great effort required to collect, organize, and clean the input data. This can surely cause overall frustration among stakeholders.
At least as far as we are concerned, plug-and-play Big Data solutions that automatically fetch the data and deliver the analysis immediately still do not exist. In 100% of the cases, all team members (technical and business) need to cooperate: creating hypotheses, selecting data samples, calibrating parameters, validating results, and then drawing conclusions. For this, an advanced, science-based methodology must be used that takes into account both the business and the technical aspects of the problem.
One of the most frequent questions in our day-to-day work at Aquarela relates to a common confusion between the concepts of Business Intelligence (BI), Data Mining, and Big Data. Since all of them deal with exploratory data analysis, it is not strange to see widespread misunderstandings. Therefore, the purpose of this post is to quickly illustrate the most striking features of each one, helping readers define their information strategy, which depends on the organization's strategy, maturity level, and context.
The basics of each involve the following steps:
Survey questions: what does the customer want to learn (find out) about his/her business? For example: How many customers do we serve each month? What is the average value of the product? Which product sells best?
Study of data sources: what internal/external data are available to answer the business questions? Where are the data? How can I obtain these data? How can I process them?
Setting the size (scope) of the project: who will be involved in the project? What is the size of the analysis or of the sample? Which tools will be used? And how much will it cost?
Development: operationalization of the strategy, performing the various data transformations and processing steps, and interacting with the stakeholders to validate the results and assumptions, checking whether the business questions were well addressed and the results are consistent.
Up to this point, BI, Data Mining, and Big Data look virtually the same, right? So, in the table below, we summarize what makes them different from each other across seven characteristics, followed by important conclusions and suggestions.
Comparative table (Click to enlarge the image)
Conclusions and Recommendations
Although our research restricts itself to 7 characteristics, the results show that there are significant and important differences between BI, Data Mining, and Big Data, serving as an initial framework to help decision makers analyse and decide what best fits their business needs. The most important points are:
We see that companies with a consolidated BI solution have more maturity to embark on extensive Data Mining and/or Big Data projects. Discoveries made by Data Mining or Big Data can be quickly tested and monitored by a BI solution. So, the solutions can and must coexist.
Big Data only makes sense for large volumes of data, and the best option for your business depends on what questions are being asked and on the available data. All solutions are dependent on their input data; consequently, if the quality of the information sources is poor, the chances are that the answer will be wrong: "garbage in, garbage out".
While BI dashboards can help you make sense of your data in a very visual and easy way, you cannot do intensive statistical analysis with them. That requires more complex solutions, alongside data scientists, to enrich the perception of the business reality by finding new correlations and new market segments (classification and prediction) and by designing infographics that show global trends based on multivariate analysis.
Big Data extends the analysis to unstructured data, e.g. social-network posts, pictures, videos, and music. However, the degree of complexity increases significantly, requiring expert data scientists in close cooperation with business analysts.
To avoid frustration, it is important to take into account the differences in the value proposition of each solution and its outputs. Do not expect real-time monitoring data from a Data Mining project. In the same sense, do not expect a BI solution to discover new business insights; that is the role of the other two solutions.
Big Data can be considered, in part, the combination of BI and Data Mining. While BI brings a set of structured data, Data Mining brings a range of algorithms and data-discovery techniques. What makes Big Data stand out is the new large-scale distributed processing technology, with the storage and memory to digest gigantic volumes of heterogeneous data, more specifically unstructured data (a minimal sketch of this kind of distributed processing appears after this list).
The results of the three can generate intelligence for the business, just as the good use of a simple spreadsheet can also generate intelligence, but it is important to assess whether this is sufficient to meet the ambitions and dilemmas of your business.
The true power of Big Data has not yet been fully recognized; however, today's most technologically advanced companies base their entire strategy on the power of the advanced analytics enabled by Big Data, and in many cases they offer their services free of charge in order to gather valuable data from users, e.g. Gmail, Facebook, Twitter, and OLX.
The complexity of data, as well as its volume and file types, tends to keep growing, as presented in a previous post. This implies a growing demand for Big Data solutions.
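The post does not name a specific engine for that distributed processing; Apache Spark, used from Python through PySpark, is one common example of the kind of technology described. The sketch below is illustrative only: the data-lake paths and column names are assumptions.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Spark distributes the work across a cluster, which is what lets "Big Data"
# digest volumes that a single BI server could not handle.
spark = SparkSession.builder.appName("sales-events").getOrCreate()

# Assumed folder of large CSV event logs.
events = spark.read.csv("s3://datalake/sales_events/*.csv",
                        header=True, inferSchema=True)

daily = (events
         .groupBy("store_id", F.to_date("event_time").alias("day"))
         .agg(F.count("*").alias("events"),
              F.sum("amount").alias("revenue")))

daily.write.mode("overwrite").parquet("s3://datalake/curated/daily_revenue")
```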
In the next post we will present which sectors are interesting for exploratory data analysis and how this can be done for each case. Thank you for joining us.