Firm foundations are vital for large-scale AI-enabled projects
Successful organisations have learned that systems need to be carefully constructed to provide the feedback which can keep initiatives moving forward.
The clamour of anticipation around new applications for artificial intelligence is as fevered as ever. The problem for me is that expectations are not informed by a robust appreciation of the practical requirements for innovating with AI. As an adviser to businesses on bringing such innovation to market, my advice is simple: to scale rapidly, large-scale AI-enabled projects must be built on firm foundations to allow multidisciplinary development teams to thrive.
Chief among the reasons is that, in engineering terms, developing AI is a complex, non-linear process. Frankly, you can expend a great deal of time and effort with very little progress to show for it. It is the antithesis of agile approaches that deliver incremental advances. Even if you do make solid progress, rapid acceleration at scale will never be guaranteed if the foundations are not fit for purpose. Pushing ahead without a good level of confidence in robust groundwork is risky indeed.
To avoid this risk, three broad foundations need to be set in stone: strategic alignment; data-driven organisation and processes; and a framework for experimenting and iterating in the wild. It is also vital to impose strong organisation and infrastructure when innovating with AI. To get these foundations in place and succeed with an ambitious vision, forge strategic alignment amongst end users and key stakeholders as early as possible.
It’s easy to align on the possibilities of AI, but agreeing on the specific use cases and associated engineering complexity and investment may be harder. It’s worthwhile getting this in place early and maintaining a dialogue as ambitions evolve. Capturing this in a multidisciplinary AI strategy from the outset that is updated as the project matures will pay dividends.
If data science and AI teams are going to be able to focus on innovation, we need to organise around them and reduce the friction that can hinder experimentation. We see this through two lenses – model management (which maps quite well to traditional software DevOps practices) and data management (which is newer to some organisations). It seems straightforward, but even reproducing results can be challenging without the right processes in place – as beautifully described by Pete Warden in his blog of a few years ago. Getting this right turns data collection and management into a source of competitive advantage.
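As an illustration of the kind of process that makes results reproducible, here is a minimal Python sketch that records a data fingerprint, the parameters and the random seed alongside each result. It is not taken from any particular tool: the names `fingerprint_data` and `run_experiment`, the toy dataset and the averaging "model" are all hypothetical stand-ins for a real training job.

```python
import hashlib
import json
import random


def fingerprint_data(rows):
    """Return a SHA-256 digest of the dataset's canonical JSON form."""
    canonical = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def run_experiment(rows, params, seed):
    """Hypothetical stand-in for training: average a seeded random sample."""
    rng = random.Random(seed)  # a local, seeded generator keeps the run repeatable
    sample = rng.sample(rows, k=params["sample_size"])
    score = sum(row["value"] for row in sample) / len(sample)
    # The manifest ties the result to the exact data, settings and seed used.
    return {
        "data_sha256": fingerprint_data(rows),
        "params": params,
        "seed": seed,
        "score": score,
    }


# Toy dataset (an assumption for illustration only).
rows = [{"id": i, "value": i * 0.5} for i in range(100)]

first = run_experiment(rows, {"sample_size": 10}, seed=42)
second = run_experiment(rows, {"sample_size": 10}, seed=42)
print(first == second)  # the same data, parameters and seed give the same manifest
```

If the data changes, even by one record, the fingerprint changes with it, so a result that no longer reproduces can be traced to its cause rather than guessed at.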
Looking at what works well in our own development work at Cambridge Consultants and amongst our top-performing clients, three principles emerge. The first is to encourage data scientists to drive requirements and engage with other disciplines. Data requirements, management and profiling are key initial steps that inform the design and development of AI technology. Secondly, it’s vital to start AI engineering as close to the data source as possible and build the right data pipeline and architecture. Finally, collect relevant data first and avoid collecting it in parallel with model development, unless there is good reason. This has implications for R&D timelines and device design.
Ensuring that data scientists are effective demands significant effort and investment. Larger projects may benefit from dedicated data engineers to free up data scientists for experimenting with data. These principles are also important for scaling – capturing the right data and processing it effectively, without inflated communications and compute costs, is essential to growth.
This applies not just to growth in numbers of users and devices but also to reach – when expanding into different contexts, retraining is required. If you train an autonomous car in the US, you will need to do some level of retraining for other geographies. It’s unlikely to be ideal to simply redeploy the same model. So that’s yet more data that needs to flow through the pipeline.
Your rate of innovation flows directly from the speed and quality of each iteration of experimentation you can run through. This means that in addition to the data infrastructure outlined above, you will need an MLOps/ModelOps infrastructure and processes. This is the engineering framework that allows you to create and deploy AI technology in the wild.
This is where the best practices of modern software development need to come together to enable design, evaluation and deployment of models. The processes and infrastructure required should seek to accelerate the learning cycle which moves product and service development on. How this plays out in the form of infrastructure and processes strongly depends on the use case – for example, applications with cloud, on-premise distributed compute, or some hybrid will each generate different needs.
Increasingly, tools are becoming available to form part of the overall solution, such as Modzy and Nubix. Regardless of what is used, top-performing organisations ensure that their systems are instrumented to provide useful measurements and hence feedback. And feedback is key to the learning which moves model design and deployment forward.
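To make the idea of instrumentation concrete, the following is a minimal Python sketch, under stated assumptions, of wrapping a model so that every prediction records measurements that can feed back into development. The `Metrics` class, the `instrumented` wrapper and `toy_model` are hypothetical names invented for this illustration; they do not represent the API of Modzy, Nubix or any other product.

```python
import time
from collections import defaultdict
from statistics import mean


class Metrics:
    """Tiny in-memory store for named measurements (illustrative only)."""

    def __init__(self):
        self._values = defaultdict(list)

    def record(self, name, value):
        self._values[name].append(value)

    def summary(self):
        return {
            name: {"count": len(vals), "mean": mean(vals), "max": max(vals)}
            for name, vals in self._values.items()
        }


def instrumented(metrics, model):
    """Wrap a model so each call records its latency and confidence."""

    def wrapper(x):
        start = time.perf_counter()
        label, confidence = model(x)
        metrics.record("latency_s", time.perf_counter() - start)
        metrics.record("confidence", confidence)
        return label, confidence

    return wrapper


def toy_model(x):
    """Hypothetical model: labels a number in [0, 1] as high or low."""
    confidence = min(abs(x - 0.5) * 2, 1.0)
    return ("high" if x >= 0.5 else "low"), confidence


metrics = Metrics()
predict = instrumented(metrics, toy_model)
results = [predict(x) for x in (0.1, 0.5, 0.9)]
report = metrics.summary()
print(report["confidence"]["count"])  # one measurement per prediction
```

In a deployed system the same measurements would typically be sent to a monitoring service rather than held in memory, but the principle is unchanged: the model's behaviour in the wild becomes data that the team can learn from.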