
What’s new about generative AI in a business context?

I’ve spent the last eight years working with AI, learning the ins and outs of building and applying AI solutions in business. After making countless mistakes, I created my own method for building and applying the technology.

That was fine and dandy until the fall of 2022, when ChatGPT was released and generative AI saw a sudden rise in usefulness and adoption. For my consulting business TodAI, that meant a lot of new projects involving generative AI and a lot of learning. After several projects, I’ve identified places where generative models are clearly distinct from other AI when applied in a business. Some are small, and others are very significant.

How do these new generative AI models change the game for applied AI?

Terminology

It’s easier to discuss the changes if we distinguish between generative AI and predictive AI.

Generative AI refers to large pre-trained models that output text, images or sound from user-provided prompts. The output is (potentially) unique and mimics human-generated content, based on the prompt and the data the model was trained on. Text-generating models such as OpenAI's GPT or Google's Bard are also known as large language models (LLMs).

Predictive AI comprises models that output one or more labels (prediction or classification) or numbers (regression or time series). It includes:

●      Image building blocks: image classification, object detection and image segmentation

●      Tabular building blocks: prediction, regression and forecast

●      Text building blocks: text classification, named entity recognition and Intent analysis

An alternative and more accurate name for predictive AI in an academic sense is discriminative AI. However, I use “predictive” as it might resonate better with most people.

Generative AI can also predict

Generative AI (such as GPT) can also be used to solve predictive problems. ChatGPT can learn to classify texts from a few examples (few-shot learning) or no examples at all (zero-shot learning). The functionality might be the same, but there’s a technical difference: generative AI doesn’t require you to train an algorithm that produces a model that can then classify. Instead, the generative model gets the examples as part of the prompt.
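
To make that difference concrete, here is a minimal sketch of few-shot classification through a prompt, using the OpenAI Python client. The labels, example tickets and model name are made up for illustration, and the call assumes an API key is already configured in your environment.

```python
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in the environment

# The "training data" is just a handful of labelled examples inside the prompt.
few_shot_prompt = """Classify the support ticket as 'billing', 'technical' or 'other'.

Ticket: "I was charged twice this month." -> billing
Ticket: "The app crashes when I upload a file." -> technical
Ticket: "Do you have an office in Berlin?" -> other
Ticket: "My invoice shows the wrong VAT number." ->"""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model choice; any chat model works the same way
    messages=[{"role": "user", "content": few_shot_prompt}],
)
print(response.choices[0].message.content)  # expected: billing (but not guaranteed)
```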

The upside of using the generative models for predictive tasks is that implementation can be done immediately. However, there are downsides, such as:

●      no way to calculate the expected performance through (for example) accuracy measures

●      the generative model might provide an output that isn’t a part of the list of provided labels

●      each prompt for output might affect future output

●      generative models tend to "forget" the initial examples, as they have a limit on how much of the conversation they can remember

Knowing how well an AI solution works is more complicated and takes more time

A good rule of thumb is that if you can’t achieve an OK accuracy of your model within 24 hours of work, you either have the wrong data or the wrong scope.

For example, a model predicting housing prices that predicts with 50% accuracy after 24 hours of modelling work will never see more than 60% or 65% accuracy, no matter what clever algorithm or fine-tuning you apply. If 60% isn’t good enough for your business case, you need to acquire more, other or better data, or change your business scope.

Following the 24-hour rule means that AI solutions that will never work are spotted early and scrapped or redefined. The 24-hour rule has saved me from countless embarrassing failures, and it works because accuracy is an excellent indicator of (though not equal to) the business value you can expect.
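
In practice, the 24-hour check can be as simple as comparing a quick model against a dummy baseline. A minimal sketch with scikit-learn; the file name and the label column are hypothetical placeholders for whatever data you managed to prepare on day one.

```python
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# "housing.csv" and the "price_bracket" label are hypothetical placeholders
df = pd.read_csv("housing.csv")
X, y = df.drop(columns=["price_bracket"]), df["price_bracket"]

dummy = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5).mean()
quick = cross_val_score(RandomForestClassifier(n_estimators=100), X, y, cv=5).mean()

print(f"dummy baseline: {dummy:.2f}, quick model: {quick:.2f}")
# If the quick model barely beats the dummy after a day of work,
# the 24-hour rule says: fix the data or the scope, not the algorithm.
```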

But that rule is no longer helpful in generative AI as there are no accuracy measurements during development. For example, if your business case is generating sales emails for a group of sales reps, you can’t measure the "accuracy" of the output. The business outcomes you’re trying to achieve might be faster communication with clients (through writing speed) or more sales (through better emails). These outcomes are hard to measure during development. Writing speed in particular is challenging to measure, as the output has to be checked and edited by a sales rep, and testing that speed requires involving the sales rep.

Generative AI requires closer domain expert collaboration

The result of this challenge is that domain experts must be involved closely in the development process to help adjust the output and measure the effect on the business outcome you’re trying to achieve. The days when you could rely on a data scientist training and fine-tuning alone until a satisfying solution was ready are over.

Picking use cases should be based on ease of testing

Deloitte’s helpful guide about generative AI suggests that use cases for generative AI should be based on the effort it takes to validate output and the effort it would take a human to generate the same content.


Deloitte, Generative AI is all the Rage


The graph shows that the less effort it takes for a human to validate the output, the better the use case. Deloitte works from a production point of view, but the same dynamic goes for testing the solution during development. If testing how a generative model's output affects the desired business outcome is too challenging, the use case might not be suitable.


So here's a new version of the 24-hour rule:

If you can’t test how your generative AI output affects the business outcome, stop the project or redefine the scope.


Generative AI benefits experts more than novice users

Predictive AI use cases have often been about creating decision support systems where models trained on historical data could predict what the experts already know. For example, an AI solution that predicts what account an invoice should be recorded on can be trained on historical work from professional accountants and used to help small business owners do their accounting.

In generative AI, the models’ output is hard to assess for novice users, who won’t necessarily know if the AI is offering confident but wrong output. An example could be a solution that drafts a contract. Novice users with little to no legal experience won’t know if a clause is illegal or bears hidden risks. An experienced legal expert, on the other hand, will be able to weed out the confident-but-wrong output and pick up on helpful output, like a rare but relevant clause in a specific case.

This is a massive game-changer for AI. Not only does it mean that use cases for AI go towards another target audience (experts), it also suggests that value creation will be a more significant part of AI business cases than saving costs.

Implementing the decision model

A crucial part of productively applying predictive AI has been to make a decision model for mapping output, confidence and actions. For example, you could take a classic churn model that predicts how likely a customer is to cancel their agreement (churn) with a company. Mapping out the actions to take given a prediction (churn or not churn) is a crucial first step in building the solution.

In fact, no code or data acquisition should be done if the decision model can’t be made. The most common cause of AI solutions not going into production is that the business can’t agree on what actions to take given a prediction and an output.

Suppose the churn model predicts a customer has a 60% likelihood of churning in the next three months. How do you handle that? Do you provide a discount? Call the client? Simply accept it and use it as a forecast? Because most businesses get stuck in strategic, legal or other discussions like these, AI solutions never get past the prototype stage. Too bad if you invested a lot of time and money in development just to get stuck.
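
A decision model doesn’t have to be complicated; it can literally be a mapping from prediction and confidence to an agreed action. A minimal sketch of what such an agreement could look like once the discussions are settled; the thresholds and actions are hypothetical examples.

```python
def churn_action(churn_probability: float) -> str:
    """Map a churn prediction to a concrete, pre-agreed action.
    Thresholds and actions are hypothetical; agreeing on them is exactly
    the work that has to happen before any modelling starts."""
    if churn_probability >= 0.80:
        return "call the client personally"
    if churn_probability >= 0.60:
        return "offer a retention discount by email"
    if churn_probability >= 0.40:
        return "flag for the account manager's next review"
    return "no action; use as input to the revenue forecast"


print(churn_action(0.60))  # -> offer a retention discount by email
```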

But it’s not as easy to make a decision model for a solution based on generative AI. First, the model output doesn’t come as simple labels, such as churn or no churn. Instead, it’s just a piece of text or an image. At the same time, there's no confidence score. The result is that you can only build a decision model with one output and one action.

On the bright side, the decision model or process flow will get simpler, although it also creates challenges. As you don’t know the quality of the output, and the output essentially can be anything, it’s tough to create fully automated systems. A confidence threshold is often set for predictive AI, and anything above that threshold triggers an automated action. For example, the above-mentioned accounting model could automatically act as a bookkeeper if confidence was above 98%.

One fix for the missing confidence score is when generative AI uses source material for inspiration. That could be a database of contract templates or support articles. In this case, you can get a confidence score for how certain the model is that the source material was a good match for the prompted request.
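
A minimal sketch of that idea: score how well the prompt matches the source material and treat the match score as a stand-in confidence. The templates, the threshold and the TF-IDF similarity below are illustrative assumptions; a real system would typically use embeddings instead.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical source material, e.g. contract templates or support articles
templates = [
    "Non-disclosure agreement template for consultants ...",
    "Employment contract template with standard notice period ...",
    "Support article: how to reset your password ...",
]

vectorizer = TfidfVectorizer().fit(templates)
template_vectors = vectorizer.transform(templates)

def best_match(prompt, threshold=0.2):
    """Return the best-matching source document and a match score that can act
    as a stand-in confidence; below the threshold, route the request to a human."""
    scores = cosine_similarity(vectorizer.transform([prompt]), template_vectors)[0]
    best = int(scores.argmax())
    if scores[best] < threshold:
        return None, float(scores[best])
    return templates[best], float(scores[best])

print(best_match("Draft a non-disclosure agreement for a consultant"))
```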

More business, less code

Developing AI solutions based on generative AI requires less code than predictive AI solutions. Generative AI solutions don’t need training, complicated prediction pipelines, or retraining flows. The output is often presented as-is to the user and not post-processed as much as in predictive AI solutions. This requires less code, and the development time is shorter. Combined with the above-mentioned challenges of applying generative AI in business contexts, the result is more business work and less code. It shifts the need towards business roles such as UX designers, solution designers and product managers, and reduces the need for developers and data scientists. Both are still important, but in terms of hourly expenditure, the business side goes up and the technical side down.

Conclusion

In my experience, the implementation of generative AI is faster and more impactful than predictive AI, but it also requires more business understanding and courage from the business implementing it.

The added requirements for business understanding will be especially interesting to observe in future AI cases, as they’ve historically been under-prioritised by most teams delivering AI solutions.

How to build a working AI only using synthetic data in just 5 minutes

Synthetic data is on the rise in artificial intelligence. It's going to make AI cheaper, better and less biased.

It's also very obtainable and usable. In a short while, it has gone from being an experimental technology to something I would not hesitate to use for production AI solutions.

To illustrate that, I will build an AI that can classify the difference between apples and bananas. I will only use images of the two classes generated by another AI, in this case DALL-E Mini.

An Apple or Banana recognizer 

I will build an image classifier using only easy-to-access, free AutoAI tools.

Generating data

We need around 30 images of each label, bananas and apples.

We will be using DALL-E Mini, an open-source version of OpenAI's text-to-image model DALL-E.

To generate the images, you can go to https://huggingface.co/spaces/dalle-mini/dalle-mini. Here you can prompt the text-to-image model with queries such as:

"Banana on table"

"Banana on random background"

"Apple on table"

"Apple on random background"

Try to match the background you will be testing on.

The response will be a grid of nine generated images for each prompt.

Generate around 30 images of each label and save them.

Creating the model

To create the model, we will be using Teachable Machine, Google's free browser-based tool for training simple models without code.

Open Teachable Machine and choose Image project and then Standard image model.

Name your classes Banana and Apple.

Upload your images from DALL-E Mini to the respective classes.

Press Train. Training will only take a few seconds.

Testing the model

Now you can try the model by showing it a banana or an apple via the webcam view.

As you can see, it works. The model has been trained using only synthetic data, which I find remarkable.
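
If you want to run the model outside the browser, Teachable Machine can also export it, for example as a TensorFlow/Keras model. A minimal sketch of loading such an export; the file names, the 224x224 input size and the [-1, 1] scaling match the export I got, but check your own export before relying on them.

```python
import numpy as np
import tensorflow as tf
from PIL import Image

# File names below came with my Teachable Machine export; adjust if yours differ.
model = tf.keras.models.load_model("keras_model.h5", compile=False)
labels = [line.strip() for line in open("labels.txt")]

image = Image.open("test_banana.jpg").convert("RGB").resize((224, 224))
x = (np.asarray(image, dtype=np.float32) / 127.5) - 1.0  # scale pixels to [-1, 1]
prediction = model.predict(x[np.newaxis, ...])[0]

print(labels[int(prediction.argmax())], float(prediction.max()))
```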

It suggests a future with significantly lower costs of building AI solutions. That means more people can build AI and utilize the possibilities of the technology.

A fascinating future!

For more tips, sign up for the book here: https://www.danrose.ai/book.

What is synthetic data for artificial intelligence?

This article is a cutout of my forthcoming book that you can sign up for here: https://www.danrose.ai/book.

Synthetic data in AI is probably the subject I think the most about currently, to be honest. It has enormous potential to improve privacy, lower bias and improve model accuracy simultaneously in a giant technological leap in the coming years. Gartner even stated, "By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated.". That is a game-changer considering that many people working with AI today haven't even started to adopt this technology. 

Synthetic data is data that doesn't come from actual observations of the world. It is fake data created by either humans or algorithms. It is created artificially, or synthetically, but the goal is the same as with real data: to represent the world in which the AI is supposed to function. Accurately representing the world is still only a means to an end; ultimately, the goal of building AI is models that predict accurately and provide a good user experience.

Types of synthetic data

Depending on the data type (text, images or tabular data), there are different approaches and use cases.

Synthetic texts

For language and text AI, you can generate synthetic texts that look like those you would find in the real world. It might even look like gibberish to a human, but if it does the job of representing the world when used for training data, that's good enough. 

I have implemented that approach before in a text classification case. I chose it because the data could only be stored for three months, making it hard to keep up with seasonal signals. I fed the real data to a language model and fine-tuned it so that it could produce data similar to the real data. We could then generate unlimited data for each label, free of personal data, to train the AI models.
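
A minimal sketch of the generation step, using the Hugging Face transformers pipeline. In the real project the language model was first fine-tuned on the actual texts; here a generic pre-trained GPT-2 stands in for it, and the seed text is made up.

```python
from transformers import pipeline

# In the real project this model was fine-tuned on the real texts first;
# plain GPT-2 is only a stand-in for the idea.
generator = pipeline("text-generation", model="gpt2")

def synthetic_examples(seed_text, n=10):
    outputs = generator(seed_text, max_length=60, num_return_sequences=n,
                        do_sample=True, temperature=0.9)
    return [o["generated_text"] for o in outputs]

# One seed text per label, then generate as many labelled examples as needed.
invoice_like = synthetic_examples("Invoice no. 1042 for consulting services,")
print(invoice_like[0])
```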

Synthetic images

For images, it's possible to use a text-to-image model that can create synthetic images simply by being prompted by a user with a text. The most famous version of this is OpenAI's DALL-E 2 model, which produces amazingly realistic pictures. An open-source version, available on HuggingFace, called DALL-E Mini, can be tried for free here: https://huggingface.co/spaces/dalle-mini/dalle-mini. You can prompt the model with a short text like "squared strawberry", and you get nine attempts from the model to produce an image of a squared strawberry.

As the model is open-source, you can also download the model and use it for your projects.

The images produced by DALL-E Mini might not be photo-realistic, but they're still good enough to train AI models.

You can try it yourself. Go to DALL-E Mini and query the model to make images of bananas and apples. Use sentences such as "Banana on table" or "Banana on random background". Do the same with apples until you have 30 or so images of each. You can now upload these images to Teachable Machine to make a banana vs apple recogniser. I promise it will work. If it doesn't impress you at least a tiny bit that you can build AI to recognise objects from purely synthetic images, I don't know what will.

The use cases here are many. You can synthetically create objects you expect but have not seen in the training data. You can also bring ordinary objects to random backgrounds to make sure you cover unknown scenarios. That will also increase the quality of the models as a change in environment will matter less.

Synthetic tabular data

Tabular data can also be generated synthetically. That is popular in healthcare, which is very vulnerable to data issues. Besides the endless combination of scenarios with different diseases and medicines interacting, there's also the privacy issue. Data from one patient's history of diagnostics and medication can be so unique that it can identify the individual. By generating synthetic versions of the actual data, the data can be extended to cover rare scenarios better and anonymised at the same time. That makes it easy to share between researchers and medical experts.
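
A minimal sketch of the idea for numeric tabular data: fit a simple distribution to the real table and sample synthetic rows from it. The columns are hypothetical, and dedicated libraries (such as SDV) model real data far more faithfully than this toy multivariate normal.

```python
import numpy as np
import pandas as pd

# Stand-in for a real table of numeric patient features (columns are hypothetical)
real_df = pd.DataFrame({
    "age": np.random.normal(55, 12, 500),
    "blood_pressure": np.random.normal(130, 15, 500),
    "cholesterol": np.random.normal(200, 30, 500),
})

# Fit a multivariate normal to the real data and sample synthetic rows from it.
mean = real_df.mean().values
cov = real_df.cov().values
synthetic_df = pd.DataFrame(
    np.random.multivariate_normal(mean, cov, size=500),
    columns=real_df.columns,
)
print(synthetic_df.head())
```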

Models of the world

With synthetic models of the world, we can also experiment with AI solutions before we release them and teach them to become better at a fraction of the cost. Self-driving cars are a perfect use case for this. They can be developed faster and more safely by building a synthetic model of the world, close to the real one, with physics and random scenarios. Many companies building self-driving cars today use models built in the engine Unity, initially intended for computer game development. Cars can try, crash and improve millions of times in a virtual world, with no humans at risk, before being released.

The good and the bad of synthetic data

The benefits of applying synthetic data to your solutions are many. It can provide more data at a lower price to improve the accuracy of models. It can reduce bias by evening out the data, adding to the otherwise rare features or labels whose scarcity would disadvantage some groups. It can improve the privacy of people whose personal data might be part of the training data. And it can let us test known and unknown scenarios.

But is it all good? No. Synthetic data is not a silver bullet. It comes with the risk of adding to bias or bringing the data further from the world it is meant to represent. The challenge is that it is difficult to identify the cause of bias, as synthetic data is often used where real data is in short supply and is therefore, by definition, challenging to reality-check. Synthetic data is a promising solution to many problems, but use it with care. As very few have experience with synthetic data in AI, we are unaware of many of the challenges that await.

For more tips, sign up for the book here: https://www.danrose.ai/book.

Black Swans in Artificial Intelligence

This article is a cutout of my forthcoming book that you can sign up for here: https://www.danrose.ai/book

A significant concept in understanding your data is the concept of black swans. The black swan theory was coined by Nassim Nicholas Taleb, statistician and author of Fooled by Randomness, a book I can only recommend.

For many years it was common knowledge that black swans did not exist. As black swans had never been observed, they did not exist in any data. Had you at that time been asked to bet on the chance that the next swan you saw would be black, you would probably have bet against such an event. It turned out that there were lots of black swans. They just had not been observed yet. They only were once Australia was discovered, and it was full of black swans. In other words, the data only represented the known and observed world and not the actual world.

This is also an excellent time to mention that data is only ever historical. And as the saying goes in data science: historical data is quite bad, but it's the best we've got.

Black swans are, on an individual level, very rare. The Covid pandemic is a recent testament to such a rare event.

One could also mention the financial crisis in 2007. Historically, the housing market never crashed, so any model built on historical data could not have predicted such an event.

But black swans are not rare on an aggregate level. They are more common than you would intuitively think. Pandemics, crashing housing markets or war in Europe seem like such unique events that they must be uncommon. But less media-attractive firsts and rare events happen all the time.

As such, you should also expect black swan events. Whatever your data shows from historical events is just history. Relying on it should be done with the knowledge that the future can be vastly different. That also translates to the accuracy of models. As AI models use parts of the historical training data to calculate their accuracy, they do so with the past in mind. As a result, you should always expect AI models to perform at least a bit worse in production than the accuracy they report.

You should also not try to predict black swans. They are, by nature, unpredictable. Instead, make sure your processes, and especially your decision models and decision thresholds, account for the fact that a black swan can appear at any minute.

For more tips, sign up for the book here: https://www.danrose.ai/book

The interview guide for domain experts in AI

This article is a cutout of my forthcoming book that you can sign up for here: https://www.danrose.ai/book

When interviewing domain experts for artificial intelligence solutions, it's essential to avoid discussing a specific solution but instead focus on the business outcome and the problem at hand. When you interview experts, they sometimes settle on a particular solution too early, even without knowing it. As the solution architect, you might also do the same and miss out on better alternatives. I often catch myself doing that as finding the perfect solution is the most satisfying part of the discovery phase. To focus on the problem and business outcome, I use the following guide as inspiration for questions.

Question: Tell me about the last time you did X (E.g. forecasted sales or did shift planning at the ice cream store)

This question works better than "How do you do forecasting?". Asking the latter will get you a polished best-case answer. The subject matter expert will tell you how everything is supposed to be done. We all want to present the best version of ourselves, and we can be a little afraid of admitting that we cut corners when we are busy or things are a little messy. But we are all busy, and everyday work is messy. Teresa Torres has a great example in her book "Continuous Discovery Habits": when you ask people how they buy jeans, they will tell you that they go by brand and quality. When you ask them how they bought jeans the last time, they will tell you that there was a nice discount.

When building AI, you are looking to identify all the mess and procedure bypassing. That is where you will face challenges, and if you can reduce those with AI, you can provide much value.

Question: How will you use the information provided by the AI? (E.g. Information about how many ice creams are sold on a given day)

That question focuses on the business need and outcome and not just the wish for the information or the technical solution. The value in any AI can be found in what action we decide on based on the information provided by the model. Uncovering the intended actions reveals the potential value of the AI solution. It also exposes the reasoning (and sometimes the lack of it) behind the need for the AI solution.

Question: How would the solution help your new colleague?

Experienced employees can have a hard time seeing the idea of assistance (from AI or not). They can always find a solution to challenges. They don't need help. But when their inexperienced colleagues become the subject, they have an easier time seeing the value and can explain how a solution will help them.

Question: Why can't you solve this problem in any other way than AI?

That will often result in the subject telling you how they think AI will solve the problem. It uncovers potential misunderstandings about what AI can and cannot do.

It also uncovers how well thought through the idea is. Is AI just the solution chosen due to hype, or have alternatives seriously been considered? Don't be afraid to challenge the idea of using AI. Any good decision can stand that test, and if it's not a good decision, you will find out at some point no matter what. Better sooner than later.

Question: Why will this solution fail?

Have you ever heard people say: "I knew that would fail"? If that is true, even occasionally, then asking this question can save you trouble. You might also know the feeling that you ignored the signs of challenges when you were too excited about a solution. I certainly do.

When asking this question, I often get the answer: "We will fail because we will try to solve everything and not get it done." That is a common challenge, and making the subjects say it out loud brings some realism to the project.

Question: Show me how you do X?

Make the person show you how they do their work. Observing a subject's actions will uncover tacit knowledge. What has become automatic and routine for the subject will confuse you, and you can point that out and ask what is going on.

Question: What will be hard about (X, Y, Z)?

I often ask questions such as "What will be hard about getting a high accuracy?" or "What will be hard about onboarding users to the solution?". Questions like that will uncover data features that might not be as trustworthy as you thought. Answers like "We changed the way we log data for X recently" are typical here.

For tips, sign up for the book here: https://www.danrose.ai/book

The three AI adoption strategies

AI comes in many different shapes and sizes. That applies to the use cases, the underlying technologies and the approaches to adopting AI in your organization. As many organizations are looking to adopt AI, leaders in all industries are requesting tangible frameworks for understanding the technology from a business perspective.

 

Some of the key questions asked by leaders are simple. How much time and money is required to adopt AI and solve business problems with it, and what returns do we get for those efforts? Those are more than reasonable questions, but answering them has been an issue for two reasons. Firstly, the answers have been a moving target: with the technology developing exponentially, the answers of yesterday seem antique today. Secondly, the intangible and explorative nature of AI has made it hard to provide such answers at all.

 

But as AI has matured as a technology and been packaged into products and ready-to-use solutions, these questions are ready to be answered. The products and solutions might come at different levels of abstraction, but they are nevertheless ready to be applied to business problems without much hassle.

 

The three main AI approaches

 

To make it easy to understand the efforts and the outcomes of AI, it can be divided into three core approaches: Off-the-shelf-AI, AutoAI and Custom AI. The idea is simple. AI has reached a point where some solutions are ready to use out of the box while others need a lot of work before being applied. All approaches come with their own benefits and drawbacks, so the trick is to understand these properties and know when to apply which. These core AI adoption strategies provide a more concrete foundation for predicting costs, risks and returns when applying AI.

 

 

Off-the-shelf-AI



Some AI solutions are ready to use out of the box and need little to no adjustment. Examples include Siri on your iPhone, invoice capture software or speech-to-text solutions. These solutions take minutes to get started with, and the business model is often AI-as-a-Service, making the initial investment low. These services are often pay-per-use and consequently imply low risk. The challenge, of course, is that you get what you get and you shouldn’t get upset. The options for adjusting and making necessary changes for the business problem are usually as low as the costs. More and more of these AI services are blooming in the AI ecosystem, with large cloud providers such as Google and Microsoft taking the lead.
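
To give a feel for how little work "minutes to get started" means, here is a minimal sketch of calling one such pay-per-use service, Google Cloud Vision's label detection. It assumes the google-cloud-vision package is installed and credentials are configured; the image file name is a placeholder.

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()  # assumes Google Cloud credentials are set up

with open("product_photo.jpg", "rb") as f:  # placeholder image
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, round(label.score, 2))
```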

 

AutoAI



Also known by the more technical name AutoML, this is the hybrid approach: it gives you freedom to shape the AI as you wish, to a certain extent, without having to reinvent the wheel. With AutoAI a business can take its own data, such as documents, customer data or even pictures of products. This data is then used to train AIs in pre-made environments that can pick the right algorithms for the job and deploy the AI ready to use in the cloud. As it can be costly to acquire data, some effort is still required with AutoAI, but at least it rarely requires a small army of data scientists. The drawback is the inflexibility inherent in standardized tools. AutoAI will also be challenged when aiming for an AI with the highest possible accuracy.
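
A minimal sketch of the AutoAI workflow using AutoGluon, one of the open-source AutoML tools covered later in this book. The CSV files and the "churned" label column are hypothetical; the point is that the tool picks and tunes the algorithms for you.

```python
from autogluon.tabular import TabularDataset, TabularPredictor

# The business's own data; file names and the "churned" label are hypothetical.
train_data = TabularDataset("train.csv")
predictor = TabularPredictor(label="churned").fit(train_data, time_limit=600)

test_data = TabularDataset("test.csv")
print(predictor.evaluate(test_data))     # accuracy and related metrics
print(predictor.leaderboard(test_data))  # which algorithms AutoGluon tried
```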

 

Custom AI


With Custom AI almost everything is built from scratch. It is a job for data scientists and machine learning engineers, and more of a task for R&D than any other place in the organization. The Custom AI approach is usually the weapon of choice when extremely high accuracy is required. Everything can be built and the possibilities are endless. That also usually means at least months or even years of work and experiments: costly and time consuming. As AutoAI and Off-the-shelf-AI become more available and advanced, Custom AI is most suitable for companies building AI solutions that compete with other AI solutions. With all the extra effort you might get the small edge that will win you the market.
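
For contrast, a minimal sketch of what "built from scratch" looks like: with a framework such as PyTorch, every architectural choice, loss and training step is yours to make and maintain. The network, data and hyperparameters below are toy placeholders.

```python
import torch
from torch import nn

class ChurnNet(nn.Module):
    """A toy custom network; in a real project the architecture, features and
    training regime would be designed and iterated on by data scientists."""
    def __init__(self, n_features: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.layers(x)

model = ChurnNet(n_features=20)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

X = torch.randn(256, 20)                   # placeholder features
y = torch.randint(0, 2, (256, 1)).float()  # placeholder labels

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```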

 

 

 

A final note

For solving problems with AI, the most common approach is moving towards off-the-shelf AI and AutoAI. An even more likely future is that a combination of the two will be the favourite choice for many organizations adopting AI.

 

The concept behind these approaches is not unique to AI. Almost all other technologies have been through the same natural progression, and now it is AI's turn. It is a sign of maturity, and of AI becoming public property, no longer hidden behind the ivy-covered walls of the top universities.

 

Of course these approaches are not set in stone, and the boundaries between them are fluid and inconsistent. But applying this framework of approaches to the conversation when adopting AI helps turn the almost magical aura of artificial intelligence into something closer to a tool in the business toolbox. And that is where the value starts mounting.

Lobe.ai Review

Lobe.ai has just been released in open beta, and the short story is that you should go try it out. I was lucky enough to test it in the closed beta, so I figured I should write a short review.

Making AI more understandable and accessible for most people is something I spend a lot of time on, and Lobe is without a doubt right up my alley. The tagline is “machine learning made simple”, and that is exactly what they do.

Overall it is a great tool, and I see it as an actual advance in AI technology, making AI and deep learning models even more accessible than the AutoML wave is already doing.

So what is Lobe.ai exactly?

Lobe.ai is an AutoML tool, which means you can make AI without coding. In Lobe’s case, it works with image classification only. In short, you give Lobe a set of labelled images, and Lobe will automatically find the best model to classify them.


Lobe has also been acquired by Microsoft, which I think is a pretty smart move. The big clouds can be difficult to get started with, and Microsoft's current AutoML solution in particular only handles tabular data and requires a good degree of technical skill to get started.

It’s free. I don’t really understand the business model yet, but so far the software is free. That is pretty cool, but I’m still curious about how they plan to generate revenue to keep up the good work.

Features

Image classification

So far Lobe only has one main feature, and that’s training an image classification network. It does that pretty well. In all the tests I have done, I have gotten decent results with only very little training data.

Speed

The speed is insane. The models are trained in what seems like a minute. That’s a really cool feature. You can also decide to train for longer to get better accuracy.

Export

You can export the model to CoreML, TensorFlow, TensorFlow Lite and they also provide a local API. 

Use Cases

I’m planning to use Lobe for both hobby and commercial projects. For commercial use, I’m going to use it for three main purposes:

Producing models

Since the quality is good and the export possibilities are ready for production, I see no reason not to use this for production purposes when helping clients with AI projects. You might think it’s better to hand-build models for commercial use, but my honest experience is that many simple problems should be solved with the simplest solution first. You might stay with a model built with Lobe for at least the first few iterations of a project, and sometimes forever.

Teaching 

As I’m teaching people about applied AI, Lobe is going to be one of my go-to tools from now on. It makes AI development very tangible and accessible, and you can play around with it to get a feeling for AI without having to code. When project and product managers get to try developing models themselves, I expect a lot more understanding of edge cases and unforeseen problems.

Selling

When trying to convince a potential customer that they should invest in AI development, you easily run into decision makers who have no understanding of AI. By showing Lobe and doing live tests, I hope to make the discussion more level, since there’s a better chance we are now talking about the same thing.

Compared with other AutoML solutions

The bad:

Less insights 

In short you don’t get any model analysis like you would with Kortical for example.

Less options

As mentioned, Lobe only offers image classification. Compared to Google AutoML, which does that plus object detection, text, tabular data, video and so on, it is still limited in use cases.

The good:

It’s Easy

This is the whole core selling point for Lobe and it does it perfectly. Lobe is so easy to use that it could easily be used for teaching 3rd graders. 

It’s Fast

The model building is so fast that you can barely get a glass of water while training. 

The average:

Quality

When I compared a model I built in Google AutoML to one I built in Lobe, Google seemed to be a bit better, but not by far. That being said, the Google model took me 3 hours to train vs. minutes with Lobe.

Future possibilities

As I see it, Lobe.ai can go in two different directions in the future. They can either cover a bigger part of the pipeline and let you build small apps on top of the models, or they can go for more types of models, such as tabular models or text classification. Both directions could be pretty interesting, and whatever direction they go for, I’m looking forward to testing it out.

Conclusion

In conclusion, Lobe.ai is a great step forward for accessible AI. Already in its beta it’s very impressive, and it will surely be the first in a new niche of AI.

It doesn’t get easier than this and with the export functionality it’s actually a good candidate for many commercial products.

Make sure you test it out, even if it’s just for fun.

How to lead AI projects

Artificial intelligence is becoming mainstream, and many organizations, startups and big corporates alike, are now starting internal AI projects or incorporating AI into existing IT. There's just one problem: leading AI projects is very different from leading traditional IT.

As a result, many AI projects either fail or ignite frustration among project participants, users of the AI and the management involved. They are unaware that they are now in a new paradigm and should have different expectations than they usually have of IT. Even implementing off-the-shelf AI components or systems can cause problems that are unusual for the organization.

This is very understandable. A few years ago AI was still something a few big techs and the universities were engaged with, but for the masses a distant future. So very few leaders and project managers have experience in the AI field they are suddenly thrown into.

So what is so different about AI projects? 

The keyword here is uncertainty. The big difference between traditional IT and AI is how much certainty is available. The low certainty is due to the experimental nature of AI. The AI paradigm is experimental in the sense that you can’t predict the road to the finished product. You can't plan the inner workings of AI models before you have created them, and you don't know exactly what data you will need or how much. Lastly, you don’t know how well the AI will work. So setting expectations for the finished solution can be very difficult. In many ways AI development is comparable with vaccine development: it's impossible to know in advance if your project will even be successful, and most of the insights you need will be acquired while developing. This is very much in contrast to the IT paradigm we know. The consensus in traditional IT projects is to plan and estimate precisely and to achieve a preset list of business objectives as timely and accurately as possible. For that reason we have developed expertise in, for example, planning tools and estimation techniques. But suddenly, with the entrance of AI, a lot of these skills are no longer useful. In fact, they can be downright damaging in an experimental paradigm. If management demands deadlines and accurate estimates, the project is bound to fail, as it will never be able to deliver.

The first step is admitting

In order to lead in the AI domain you must, first of all, acknowledge that this is a new paradigm, and you must speak about it openly. When working in a new paradigm, the most important tool is to be vocal about the new rules of engagement. With very little certainty in AI, expectation management is already approaching an art. So if you are not even in a dialogue with the project stakeholders about what the new way of working looks like, expectations will never align. Be very clear about this up front. Even as you’re making the business case, you have to be clear that you can know neither the costs nor the revenue generated by an AI project. Not everyone will like it and you will see resistance, but it’s much better to take the conflicts before the project starts.

It's a culture thing

The enabler to get acceptance from stakeholders to work in this kind of uncertainty is the right organizational culture. As a leader it is your responsibility to massage and try to shift the culture in a direction that works with experimental AI projects. If there’s a mismatch between the paradigm you work in and the culture, you will get in trouble in no time.

One of the important features in an experimental culture is the willingness to accept null results exactly like the scientific community does. This is not just the usual preach about accepting failure and mistakes. This is a culture that sees a lot of hard work amounting to no more than the knowledge that a specific solution is not viable.

The experimental culture that fits AI development is very much in line with the learning culture, one of the eight distinct culture styles of corporate culture. Other cultural styles, such as the results culture and the safety culture, can be in sharp contrast to the learning style on crucial points when working with AI. Those styles are, respectively, very keen on achieved results and on accurate planning. With AI, which offers no certainty of either results or predictability, this can quickly conflict.

Be visionary

When leading under uncertainty, leading through a strong vision is very effective. Leading through a strong vision is very much in line with the Purpose culture style. I like to compare it to Columbus' journey to America. Not knowing what to expect on the way or if the journey would even see any kind of success, Columbus still managed to get funding and a reliable crew. Columbus was well known for being extremely strong in his vision and I would attribute at least some of his voyage success to a strong vision. The trick here is to be specific about what you envision on the other side of uncertainty. How will everything look and feel when the project is done?

In conclusion, it’s very effective to actively use the culture to support AI development, since the alternative might be having it work against you. And organizational culture is definitely one of the stronger forces of the universe.

Don’t be data-driven in AI

Being data-driven is usually understood with positive connotations, but when I hear the term used I get a little anxious about the “data-driven” decisions that might be about to happen. Let me explain why.

According to Wikipedia, “The adjective data-driven means that progress in an activity is compelled by data, rather than by intuition or by personal experience.” In other words: look at the data as a primary source of information to act on. When the data gives you a reason to act, you act. At a glance it might seem like a very sound way to work, especially in the AI domain, which in so many ways relies on data. But in fact, being data-driven can be very problematic when working with AI. I actually think people who say they are data-driven are in general on the wrong track. This does not mean that I’m against putting much effort into understanding your data. I’m actually a big believer that collecting, understanding and preparing data for AI projects should be the activities with the most resources allocated to them. So I’m pro good data science but against being driven by data, and I see those as two very different things.

But then why is it so problematic to be data-driven?

My primary argument is that the driver behind decision making and activities should not be the data you have, but rather curiosity about the problem and the world around it. In a sense, that would mean being driven by the data you don't have. The final goal of AI projects is often to solve a problem or improve a process, and the solutions do not always exist in the data you have generated or that is being generated by the current solutions. So instead you should be curiosity-driven, or at least problem-driven. This means that you should not approach problems by looking at your data and drawing a conclusion. You should look at your data, look for the blind spots, and from there be curious. What is it that you don’t know? I’ll get back to curiosity later. First I have some more arguments against being data-driven.

You will extremely rarely have all the relevant data for a problem, even after exhausting all potential data sources. So when you draw conclusions from the data you have, the conclusion will always be at least a bit off. This doesn’t mean that data is not useful or that the conclusion is not useful, but you will always be at least a bit wrong. As statisticians say: "all models are wrong, but some are useful".

Another problem with being data-driven is the narrative that decisions made on data are better than decisions made on gut feeling. And while that might sometimes be true, data is not one-size-fits-all; it can be very helpful at times and very misleading at others.

An example is the father of modern statistics, Ronald Fisher, who in hindsight was also a little too data-driven. He stubbornly held to his conclusion that the data showed lung cancer was not a result of smoking. The correlation, he said, must run the other way around: people with lung cancer, or a higher risk of lung cancer, were simply more likely to be smokers. He argued that it was either a genetic relation or that cancer patients used smoking to soothe pain in their lungs. So even the best statisticians can be told stories by data that are far from the truth.

The last problem with data is its ability to tell you the story you want it to tell, whether consciously or unconsciously. A famous quote by the economist Ronald Coase goes: “If you torture the data long enough, it will confess to anything.” So there is no certainty that the conclusion you get from data is correct. The interpretation can be very biased, and sometimes we torture the data without even being aware of it ourselves.

About curiosity

So, as promised, I’m getting back to being curious. If I had to choose one keyword for succeeding with AI, it would be curiosity. AI projects usually start with a process to optimize or a problem to solve, and before training a model on data you have to be curious about the problem. In that way the data follows from the problem and will, as a result, be more relevant and more specific to it.

Curiosity to me means exploring with as little preconception as possible. The best example for me is when children lift up rocks on the ground just to see what is underneath. If you have ever seen a child doing that, you will have seen that there are no expectations, only excitement, both before and after the rock has been lifted. And that is exactly what curiosity does to the practitioner. It leads to excitement, which in turn leads to passion. Passion makes everything much easier, and even the tedious parts of a project will feel effortless.

AI is also explorative in nature, and that’s why curiosity suits it so well. If there are specific expectations in an explorative process, disappointment is almost a given.

As a result, you must let curiosity be the primary driver behind the decisions and activities you make. Being data-driven is reactive in nature, and if you want to be innovative in solving problems you must be proactive. Being proactive requires you to be curious about your blind spots and be driven by the unknown.

AutoML solutions overview

Introduction

 

I have been looking for a list of AutoML solutions and a way to compare them, but I haven’t been able to find one. So I thought I might as well compile that list for others to use. If you are not familiar with AutoML, read this post for a quick introduction and pros and cons.

 

I haven’t been able to test them all and make a proper review, so this is just a comparison based on features. I tried to pick the features that felt most important to me, but it might not be the most important for you. If you think some features are missing or if you know an AutoML solution that should be on the list, just let me know.

 

Before we go to the list, I’ll quickly go through the features and how I interpret them.

 

Features

Deployment 

Some solutions can be auto deployed directly to the cloud with a one-click deployment. Some just export to Tensorflow and some even have specific export to edge devices.

Types 

This can be text, images, video or tabular data. I guess some of the open-source ones can be stretched to do anything if you put in the work, so it might not be the complete truth.

Explainable 

Explainability in AI is a hot topic and a very important feature for some projects. Some solutions give you no insights and some give you a lot; it might even be a strategic differentiator for the provider. I have simply divided this feature into Little, Some and Very explainable.

Monitor 

Monitoring models after deployment to avoid drifting of models can be a very useful feature. I divided this into Yes and No.

Accessible

Some of the providers are very easy to use and some of them require coding and at least basic data science understanding. So I took this feature in so you can pick the tool that corresponds to the abilities you have access to.

Labeling tool

Some have an internal labelling tool so you can directly label data before training the model. That can be very useful in some cases.

General / Specialized

Most AutoML solutions are generalized for all industries but a few are specialized to specific industries. I suspect this will become more popular, so I took this feature in.

Open Source

Self-explanatory. Is it open source or not.

Includes transfer Learning

Transfer learning is one of the big advantages of AutoML. You get to piggyback on big models so you can get great results with very little data.

 

AutoML solutions list

 

Google AutoML

 

Google AutoML is the one I’m the most familiar with. I found it pretty easy to use even without coding. The biggest issue I’ve had is that the API requires a bunch of setup and is not just a simple token or Oauth-based authentication.

 

Deployment: To cloud, export, edge

Types: Text, Images, Video, Tabular

Explainable: Little

Monitor: No

Accessible: Very

Labeling tool: Used to have but is closed

General / Specialized: Generalized

Open Source: No

Includes transfer Learning: Yes

Link: https://cloud.google.com/automl

 

Azure AutoML

Microsoft's cloud AutoML seems to be more explainable than Google’s, but it only covers tabular data models.

 

Deployment: To cloud, some Local

Types: Only Tabular

Explainable: Some

Monitor: No

Accessible: Very

Labeling tool: No

General / Specialized: Generalized

Open Source: No

Includes transfer Learning: Yes

Link: https://azure.microsoft.com/en-us/services/machine-learning/automatedml/

Lobe.AI

This solution is still in beta but works very well in my experience. I’ll write a review as soon as it goes public. Lobe is so easy to use that you can let a 10-year-old use it to train deep learning models. I’d really recommend it for educational purposes.

 Deployment: Local and export to Tensorflow

Types: Images

Explainable: Little

Monitor: -

Accessible: Very - A third grader can use this

Labeling tool: Yes

General / Specialized: Generalized

Open Source: No

Includes transfer Learning: Yes

Link: https://lobe.ai/

 

Kortical

Kortical seems to be one of the AutoML solutions that differentiates itself by being as explainable as possible. This can be a huge advantage when you're not just trying to get good results but also to understand the business problem better. For that, I’m a bit of a fan.

Deployment: To cloud

Types: Tabular

Explainable: Very

Monitor: No

Accessible: Very

Labeling tool: No

General / Specialized: Generalized

Open Source: No

Includes transfer Learning: Not sure

Link: https://kortical.com/

DataRobot

A big player that might even be the first pure AutoML to go IPO.

Deployment: To cloud

Types: Text, Images and Tabular

Explainable: Very

Monitor: Yes

Accessible: Very

Labeling tool: No

General / Specialized: Generalized

Open Source: No

Includes transfer Learning: Yes

Link: https://www.datarobot.com/platform/automated-machine-learning/

 

AWS Sagemaker Autopilot

Amazon's AutoML. It requires more technical skill than the other big cloud suppliers and is quite limited, supporting only two algorithms: XGBoost and logistic regression.

 
Deployment: To cloud and export

Types: Tabular

Explainable: Some

Monitor: Yes

Accessible: Requires coding

Labeling tool: Yes

General / Specialized: Generalized

Open Source: No

Includes transfer Learning: Yes

Link: https://aws.amazon.com/sagemaker/autopilot/

MLJar

 Deployment: Export and Cloud

Types: Tabular

Explainable: Yes

Monitor: -

Accessible: Very

Labeling tool: No

General / Specialized: Generalized

Open Source: MLJar has both an open-source (https://github.com/mljar/mljar-supervised) and a closed-source solution.

Includes transfer Learning: Yes

Link: https://mljar.com/

Autogluon

 Deployment: Export

Types: Text, Images, tabular

Explainable: -

Monitor: -

Accessible: Requires coding

Labeling tool: No

General / Specialized: Generalized

Open Source: Yes

Includes transfer Learning: Yes

Link: https://autogluon.mxnet.io/

JadBio

 Deployment: Cloud and Export

Types: Tabular

Explainable: Some

Monitor: No

Accessible: Very

Labeling tool: No

General / Specialized: LifeScience

Open Source: No

Includes transfer Learning: -

Link: https://www.jadbio.com/

  

AUTOWEKA

This solution supports Bayesian models which is pretty cool.

 

Deployment : Export

Types: -

Explainable: -

Monitor: -

Accessible: Requires Code

Labeling tool: No

General / Specialized: Generalized

Open Source: Yes

Includes transfer Learning:No

Link: https://www.cs.ubc.ca/labs/beta/Projects/autoweka/

 

H2o Driverless AI 

Also supports bayesian models

Deployment: Export

Types: -

Explainable: -

Monitor: -

Accessible: Semi

Labeling tool: No

General / Specialized: Generalized

Open Source: Both options

Includes transfer Learning: -

Link: https://www.h2o.ai/

 

Autokeras

Autokeras is one of the most popular open source solutions and is definitely worth trying out.

Deployment: Export

Types: Text, Images, tabular

Explainable: Possible

Monitor: -

Accessible: Requires Code

Labeling tool: No

General / Specialized: Generalized

Open Source: Yes

Includes transfer Learning: -

Link: https://autokeras.com/

 

TPOT

 Deployment: Export

Types: Images and Tabular

Explainable: Possible

Monitor: -

Accessible: Requires Code

Labeling tool: No

General / Specialized: Generalized

Open Source: Yes

Includes transfer Learning: -

Link: http://epistasislab.github.io/tpot/

 

Pycaret

Deployment: Export

Types: Text, Tabular

Explainable: Possible

Monitor: -

Accessible: Requires Code

Labeling tool: No

General / Specialized: Generalized

Open Source: Yes

Includes transfer Learning: -

Link: https://github.com/pycaret/pycaret

AutoSklearn

Deployment: Export

Types: Tabular

Explainable: Possible

Monitor: -

Accessible: Requires Code

Labeling tool: No

General / Specialized: Generalized

Open Source: Yes

Includes transfer Learning: -

Link: https://automl.github.io/auto-sklearn/master/

TransmogrifAI

Made by Salesforce.

Deployment: Export

Types: Text and Tabular

Explainable: Possible

Monitor: -

Accessible: Requires Code

Labeling tool: No

General / Specialized: Generalized

Open Source: Yes

Includes transfer Learning: -

Link: https://transmogrif.ai/

 

6 things you should know before beginning with AI projects

Artificial intelligence (AI) projects are becoming commonplace for big businesses and entrepreneurs alike. As a result, many people with no prior experience with AI are now being put in charge of AI projects. Almost 5 years ago that happened to me for the first time, and I’ve since learned a lot. So here are six things I wish I had known when I did my first AI project.

1. Data is the most expensive part

AI is often talked about as being technically very difficult requiring extensive resources to develop. But in fact that’s not the complete truth. The development can be costly but the vast majority of the work and resources needed is usually in acquiring, cleaning and preparing data for the development to take place. 

Data is also the most crucial element when trying to make the AI successfully do its job. As a result you should always prefer superior data over superior technology when making AI models.

So when budgeting for an AI project, make sure you set aside most of the time and money for getting a lot of good quality data. And remember that you might even need to acquire fresh data continuously if the domain you work in has changing conditions.

2. AI technology is more accessible than you think

In a very short time, AI has made the jump from requiring specialist data scientists and machine learning engineers to a point where we can now make AI models without a single line of code. A multitude of AutoML (automatic machine learning) vendors have appeared in recent years, and they are rapidly improving. That means that getting started with AI doesn’t require as much investment as before.

The data acquisition and the human processes, like training and onboarding, still require hard work though, and neither should be underestimated.

3. AI is experimental 

Developing AI is an experimental process. You cannot know how long it will take to develop what you have in mind or how good it will be. In some cases you cannot even be sure that AI is a feasible solution to your problem before trying. 

The best way to succeed with uncertain project conditions like this is to time-cap and milestone-fund the project. Set short milestones and only release more funds if the goals for each milestone have been met, or at least if you see meaningful progress. If you fund the whole project up front, you might end up pouring all your money into a dead end that could have been caught early.

4. Be clear on what success means for your project

Before getting started you should be very clear with your stakeholders about what a successful project will look like. New technology like AI can quickly be held to golden standards that it will never achieve. If expectations are not aligned before the kick-off, you might end up thinking you made a fantastic solution while some of your stakeholders are disappointed. In my experience, the exact same AI solution can amaze some people and leave others unimpressed.

A good way to deal with this is to make all stakeholders agree that the first version of the AI should just be able to deliver the status quo. From there you can improve and gradually increase the value.

5. Users will lose their sense of control

It can be hard to explain the inner workings and the reasoning behind an AI’s output. At the same time, you cannot know exactly what output it will give for a specific input. That makes it feel just as unpredictable as, or even more unpredictable than, humans doing the same tasks. Since users of an AI cannot ask it questions or know whether the feedback they give will make a difference, they often feel a loss of control.

To avoid that feeling, you must first of all prepare the users for this new paradigm. It’s much easier if they buy in to these conditions before they get to try the AI. If possible, you can also provide feedback mechanisms so the users at least feel they can make a difference. It won’t work every time, but it’s better than nothing.

It’s also a good idea to manage expectations through the right narrative. Make it clear whether the AI is a decision system that makes decisions on its own or a support system that only makes suggestions. When users clearly understand the purpose of the AI, they usually get comfortable with it more quickly.

6. People have very different understandings of what AI is 

As a rule of thumb, everyone has a different understanding of AI. Managers, users, developers and all other stakeholders will each have their own idea of what AI actually is. That’s fair enough, since AI does not have one definite definition, but it becomes a source of problems if everyone involved in a project has a different understanding of what is going on. So before you start a project, don’t take for granted that anybody thinks the way you do. Be explicit about what AI means to you and how you will approach it.

Top 5 ways to make better AI with less data

1. Transfer Learning

Transfer learning is used a lot in machine learning now, since the benefits are big. The general idea is simple. You train a big neural network for general purposes with a lot of data and a lot of training. When you then have a specific problem, you sort of “cut the end off” the big network and train a few new layers with your own data. The big network already understands a lot of general patterns, so with transfer learning you don’t have to teach the network those patterns this time.

A good example is training a network to recognize images of different dog species. Without transfer learning you need a lot of data, maybe 100,000 images of different dog species, since the network has to learn everything from scratch. If you train a new model with transfer learning, you might only need 50 images of each species.

You can read more about Transfer Learning here.
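To make the idea concrete, here is a minimal sketch of how “cutting the end off” a pre-trained network might look in PyTorch. The choice of ResNet-18, the number of species and the learning rate are assumptions for illustration only, and it assumes a reasonably recent torchvision.

import torch
import torch.nn as nn
from torchvision import models

NUM_SPECIES = 10  # hypothetical number of dog species

# Load a network pre-trained on ImageNet; it already knows general image patterns.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained layers so only the new "end" gets trained.
for param in model.parameters():
    param.requires_grad = False

# "Cut the end off": replace the final layer with one for our own classes.
model.fc = nn.Linear(model.fc.in_features, NUM_SPECIES)

# Only the new layer's parameters are optimised, on our own small dataset.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# ...then train as usual, on perhaps 50 images per species instead of 100,000.

Because only the final layer is trained, a small dataset can go a long way.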

2. Active learning

Active learning is a data collection strategy that lets you pick the data your AI models would benefit the most from during training. Let’s stick with the dog species example. You have trained a model that can differentiate between species, but for some reason it always has trouble identifying German Shepherds. With an active learning strategy, you would automatically, or at least through an established process, pick out those images and send them for labelling.

I made a longer post about how active learning works here.
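As a rough illustration, the selection step in active learning could look like the sketch below. The names model and unlabelled_images are hypothetical placeholders, and confidence here is simply the model’s top predicted probability.

import numpy as np

def select_for_labelling(model, unlabelled_images, budget=100):
    # Predicted class probabilities for each unlabelled image.
    probs = model.predict(unlabelled_images)   # shape: (n_images, n_classes)
    confidence = probs.max(axis=1)             # confidence in the top guess
    # The lowest-confidence images (e.g. the troublesome German Shepherds)
    # are sent for labelling first.
    return np.argsort(confidence)[:budget]     # indices to send to the labellers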

3. Better data

This strategy might sound obvious, but it’s sometimes overlooked. With better-quality data you often need far less of it, since the AI does not have to train through the same amount of noise and wrong signals. In the media, AI is often talked about as if “with a lot of data you can do anything”. But in many cases, making an extra effort to get rid of bad data and to ensure that only correctly labelled data is used for training makes more sense than going for more data.

4. GANs

GANs, or Generative Adversarial Networks, are a way of building neural networks that sounds almost futuristic in its design. Basically, this kind of neural network is built by having two networks compete against each other in a game where one network creates new fake training examples from the dataset and the other tries to guess what is fake and what is real. The network building fake data is called the generator, and the network trying to tell fake from real is called the discriminator. This is a deep learning approach, and both networks keep improving during the game. When the generator is so good at generating fake data that the discriminator consistently has trouble separating fake from real, we have a finished model.

For GANs you still need a lot of data, but you don’t need as much labelled data, and since labelling is usually the costly part, you can save time and money on your data with this approach.
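To show the generator-versus-discriminator game in code, here is a heavily simplified PyTorch sketch. The network sizes, data shape and learning rates are made up for illustration, and a real GAN needs much more care with training stability.

import torch
import torch.nn as nn

latent_dim, data_dim = 32, 784  # e.g. flattened 28x28 images

generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                          nn.Linear(128, data_dim), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
                              nn.Linear(128, 1), nn.Sigmoid())

bce = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_batch):
    n = real_batch.size(0)
    real_labels, fake_labels = torch.ones(n, 1), torch.zeros(n, 1)

    # Discriminator: learn to separate real data from the generator's fakes.
    fakes = generator(torch.randn(n, latent_dim)).detach()
    d_loss = (bce(discriminator(real_batch), real_labels) +
              bce(discriminator(fakes), fake_labels))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator: learn to fool the discriminator into calling fakes real.
    g_loss = bce(discriminator(generator(torch.randn(n, latent_dim))), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()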

5. Probabilistic Programming

One of my very favorite technologies. Probabilistic programming has a lot of benefits, and one of them is that you can often get away with using less data. The reason is simply that you build “priors” into your models. That means you can code your domain knowledge into the model and let the data take it from there. In many other machine learning approaches, everything has to be learned by the model from scratch, no matter how obvious it is.

A good example here is document data capture models. In many cases, the data we are looking for is given away by the keyword to the left of it; “ID number: #number#” is a common format. With probabilistic programming, you can tell the model before training that you expect the data to be to the right of the keyword. Many neural networks have to learn this from scratch, requiring more data.

You can also read more about probabilistic programming here: https://www.danrose.ai/blog/63qy8s3vwq8p9mogsblddlti4yojon
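As a small illustration of building a prior into a model, here is a sketch using PyMC (version 4 or later). The keyword example, the observations and the numbers in the prior are all hypothetical; the point is that the domain knowledge goes into the model before the data does.

import pymc as pm

# Hypothetical observations: 1 = the value was found to the right of the
# keyword "ID number:", 0 = it was found elsewhere.
observations = [1, 1, 1, 0, 1, 1]

with pm.Model():
    # Prior: before seeing any data we already believe roughly 80% of values
    # sit to the right of the keyword (our coded-in domain knowledge).
    p_right = pm.Beta("p_right", alpha=8, beta=2)
    pm.Bernoulli("obs", p=p_right, observed=observations)
    idata = pm.sample(1000, tune=1000)

# The posterior combines the prior with the few observations, giving a full
# uncertainty distribution rather than a single number.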

What is artificial intelligence? – The pragmatic definition

There’s no definitive or academically correct definition of artificial intelligence. The common ones usually define AI as computer models that perform tasks in a human-like manner and simulate intelligent behaviour.

The truth is that for each expert in the field of AI you ask about the definition, you will get a new variation of it. In some cases you might even get a very religious schooling on how it’s only AI if the system contains deep learning models. Ask many startup founders, and they will claim their systems contain AI even though it’s just a very simple regression algorithm like we have had in Excel for ages. In fact, a VC fund recently found that less than half of the startups claiming to be AI startups actually have any kind of AI.

So when is something AI? My definition is pretty pragmatic. For me, AI has a set of common features, and if they are present, I would define the system as AI. Especially when working with AI, these features will be very much in your face. In other words, if it looks like a duck, swims like a duck and quacks like a duck, then it probably is a duck. The same goes for AI.

The common features I see:

- The system simulates a human act or task.

- The learning comes from examples instead of instructions. This entails the need for data.

- Before the system is ready, you cannot predict that input X will give you exactly output Y; only approximately.

So to me, it’s not really important which underlying algorithm does the magic. If the common features are present, it will feel the same to use and to develop, and the challenges you face will be the same.

It might seem like this definition takes AI to a simpler and less impactful place, but that only scratches the surface. The underlying human, technical and organizational challenges are enormous when working with AI.

Model-assisted labelling – For better or for worse?

For many AI projects, collecting data is without a doubt the most expensive part of the project. Labelling data like images and pieces of text is hard and tedious work with little possibility of scaling. If an AI project requires continuously updated or fresh data, this cost can challenge the whole business case of an otherwise great project.

There are a few strategies, though, to lower the cost of labelling data. I have previously written about active learning: a data collection strategy that prioritizes labelling the most crucial data first, namely where the model’s confidence is weakest. This is a great strategy, but in most cases you still need to label a lot of data.

To speed up the labelling process, the strategy of model-assisted labelling has come up. The idea is simply that you train an AI in parallel with labelling, and as the AI starts to see patterns in the data, it suggests labels to the labeller. In that way the labeller can, in many cases, simply approve the suggested label.

Model-assisted labelling can be done by training a model solely for the purpose of labelling, or by putting the actual production model in the labelling loop and letting it suggest labels.

But is model-assisted labelling just a sure way to get data labelled quicker? Or are there downsides to the strategy? I have worked intensively with model-assisted labelling, and I know for sure that there are both pros and cons; if you’re not careful, you can end up doing more harm than good with this strategy. If you manage it correctly, it can work wonders and save you a ton of resources.

So let’s have a look at the pros and cons.

The Pros

The first and foremost advantage is that it’s faster for the person doing the labelling to work with pre-labelled data. Approving the label with a single click in most cases, and only having to manually select a label once in a while, is just way faster. Especially when working with large documents or models with many potential labels, the speed can increase significantly.

Another really useful benefit of model-assisted labelling is that you get an idea of the model’s weak points very early on. You get a hands-on understanding of which instances are difficult for the model and which it usually mislabels. This reflects the results you should expect in production, so you get the chance to improve or work around these weak points early. Weak points in the model also often suggest a lack of data volume or quality in those areas, so it provides insight into what kind of data you should go and label more of.

The Cons

Now for the cons. As I mentioned, they can be pretty bad. The biggest issue with model-assisted labelling is that you run the risk of lowering the quality of your data. So even though you get more data labelled faster, the lower quality can leave you with a model performing worse than it would have had you not used model-assisted labelling.

So how can model-assisted labelling lower the data quality? It’s actually very simple: humans tend to prefer defaults. The second you slip into autopilot, you start making mistakes by being more likely to accept the default or suggested label. I have seen this time and time again. The biggest source of mistakes in labelling tends to be accepting wrong suggestions. So you have to be very careful when suggesting labels.

Another downside is if the pre-labelling quality is simply so low that it takes the labeller more time to correct the suggestions than it would have taken to start from a blank answer. So you have to be careful not to enable pre-labelling too early.

A few tips for model-assisted labelling

I have a few tips for being more successful with model-assisted labelling.

The first tip is to set a target for data quality. You will never get 100% correct data anyway, so you will have to accept some number of wrong labels. If you can set a target that is acceptable to train the model from, you can monitor whether the model-assisted labelling is beginning to do more harm than good. That also works great for aligning expectations on your team in general.

I’d also suggest doing samples without pre-labelling to measure whether there’s a difference between the results you get with and without it. You simply do this by turning off the assist model for, say, one out of every ten cases. It’s easy and reveals a lot.
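In code, that tip could look roughly like the sketch below, where suggest_label and human_label stand in for your assist model and your labelling tool; both names are hypothetical.

import random

def label_with_quality_check(items, suggest_label, human_label, holdout_rate=0.1):
    assisted, blind = [], []
    for item in items:
        if random.random() < holdout_rate:
            # No suggestion shown: the labeller starts from a blank answer.
            blind.append(human_label(item, suggestion=None))
        else:
            # Suggestion shown: the labeller approves or corrects it.
            assisted.append(human_label(item, suggestion=suggest_label(item)))
    # Comparing the two groups (e.g. against spot-checked ground truth) shows
    # whether pre-labelling is quietly lowering your data quality.
    return assisted, blind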

Lastly, I will suggest one of my favorites. Probabilistic programming models are very beneficial for model-assisted labelling. Probabilistic models are Bayesian and therefore express uncertainty as distributions instead of scalars (single numbers), which makes it much easier to know whether the pre-label is likely to be correct.

What is data operations (DataOps)?

When I write about AI, I very often refer to data operations and how important a foundation it is for most AI solutions. Without proper data operations you can easily get to a point where handling the necessary data is too difficult and costly for the AI business case to make sense. So to clarify a little, I want to give you some insight into what it really means.

Data operations is the process of obtaining, cleaning, storing and delivering data in a secure and cost-effective manner. It’s a mix of business strategy, DevOps and data science, and it is the underlying supply chain for many big data and AI solutions.

The term data operations was originally coined in the context of big data but has become more broadly used in recent years.

Data operations is the most important competitive advantage

As I have mentioned in a lot of previous posts, I see data operations as a higher priority than algorithm development when it comes to beating the competition. In most AI cases, the algorithms used are standard algorithms from standard frameworks that are fed data, trained and tuned a little before being deployed. Since the underlying algorithms are largely the same, the real difference is in the data. The work it takes to get good results from high-quality data is almost nothing compared to the work it takes with mediocre data. Getting data at a lower cost than the competition is also a really important factor, especially in AI cases that require a continuous flow of new data. In those cases, getting new data all the time can become an economic burden that weighs down the business.

Data operations example: Paperflow

To make it more concrete, I want to use the AI company I co-founded, Paperflow, as an example. Paperflow receives invoices and other financial documents and captures data such as the invoice date, amounts and invoice lines. Since invoices can look very different and their layouts change over time, getting a lot of data, and getting more of it all the time, is necessary. So to make Paperflow a good business, we needed good data operations.

To be honest, we weren't that aware of its importance when we made these initial decisions, but luckily we got it right. Our first major decision in data operations was to collect all data in-house and build our own system for collecting it. That’s a costly choice, with both a high upfront investment in the initial system development and a high recurring cost for the employees whose job it is to enter data from invoices into the system. The competition had chosen another strategy: they had the customers enter the invoice data into their system whenever their AI failed to make the right prediction on the captured data. That’s a much cheaper strategy that can provide you with a lot of data. The only problem is that customers have only one thing in mind, which is to solve their own problem, regardless of whether the result is correct in terms of what you need for training data.

So in Paperflow we found a way to get better data. But how do you get the costs down then?

Part of the solution was to invest heavily in the system used for entering data and to make it as fast to use as possible. It was really trial and error, and it took a lot of work. Without having the actual numbers, I would guess we invested more in the data operations systems than in the AI itself.

Another part of the solution was to make sure we only collected the data we actually needed. This is a common challenge in data operations, since it’s very difficult to know what data you are going to need in the future. Our solution was to start by collecting a lot of data (too much, in fact) and then slowly narrow down the amount collected. Going the other way around can be difficult: if we had suddenly started to collect more data on each invoice, we would basically have needed to start over and discard all previously validated invoices.

We also started to work a lot on understanding a very important question: when are the AI’s guesses so reliable that we can trust them and skip validating that part of the data? We got there with a variety of tricks and technologies, one of them being probabilistic programming. Probabilistic programming has the advantage of delivering an uncertainty distribution instead of the single percentage most machine learning algorithms give you. Knowing how sure you are significantly lowers the risk of making mistakes.

The strategy of only collecting the data you need the most, by choosing the cases where your AI is the most uncertain, is also known as active learning. If you are working on your data operations for AI, you should definitely look into it.
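A rough sketch of that kind of confidence-based routing might look like this; the field names and the threshold are made up for illustration, and this is not Paperflow's actual implementation.

def route_fields(predictions, threshold=0.98):
    auto_accepted, needs_validation = {}, {}
    for field, (value, confidence) in predictions.items():
        if confidence >= threshold:
            auto_accepted[field] = value        # trusted, skip manual validation
        else:
            needs_validation[field] = value     # uncertain, send to a human
    return auto_accepted, needs_validation

# Example (hypothetical values):
# route_fields({"invoice_date": ("2021-03-01", 0.995),
#               "total_amount": ("1450.00", 0.62)})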

DevOps data operation challenges

On the more tech-heavy side, storing data effectively also brings challenges. I’m not a DevOps expert, but I have seen in real life the problem of data suddenly growing faster than expected. That can be crucial, since the ability to scale quickly comes under pressure. If I could give one piece of advice here, it would be to involve a DevOps engineer early on in the architecture work. Building on a scalable foundation is much more fun than trying to find short-term solutions all the time.
