How AI is driving a rethink of storage architecture


A common refrain in the technology industry is that artificial intelligence (AI) needs fast storage. While true, speed is not the only consideration when supporting the growing number of AI workloads, according to Botes, vice-president of AI infrastructure at Pure Storage.

The real challenge, he argues, is to rethink storage for a world of autonomous AI agents, where data trust and lineage matter. That calls for a rethink of storage architecture, going beyond performance to build new audit capabilities directly into the storage layer.

“No storage system has been designed this way,” Botes told Computer Weekly in a recent interview. Current systems treat files and objects as mutable items to be read and written. That is not enough for what Botes calls “systems of consequence”, in which AI interactions have real-world impact and must be verifiable for compliance and quality assurance purposes.

To address this, Botes advocates storage systems that record every interaction an AI agent has with data. “The second you move away from simple chatbots into consequence systems, where data has an impact, you want to have an auditor,” he said. “Every interaction involving agents has to be logged for audit reasons, and you want a copy of the conversation.”
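To make the idea concrete, the following is a minimal sketch of what such an audit trail might capture, assuming hypothetical names and types rather than any Pure Storage API: an append-only log in which each record ties an agent, the object and version it touched, and a copy of the conversation that drove the access.

```python
# Minimal sketch (not a Pure Storage API): an append-only audit record of
# every interaction an AI agent has with a stored object, including a copy
# of the conversation that triggered it. All names here are hypothetical.
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class AgentAccessRecord:
    agent_id: str              # which agent touched the data
    object_key: str            # which object it touched
    object_version: int        # which version was read or written
    action: str                # "read" or "write"
    conversation: list[str]    # copy of the prompt/response exchange
    record_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: float = field(default_factory=time.time)


class AuditLog:
    """Append-only: records are only ever added, never modified or deleted."""

    def __init__(self, path: str):
        self._path = path

    def append(self, record: AgentAccessRecord) -> None:
        with open(self._path, "a", encoding="utf-8") as f:
            f.write(json.dumps(asdict(record)) + "\n")

    def for_object(self, object_key: str):
        """Replay every agent interaction with a given object, in order."""
        with open(self._path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                if rec["object_key"] == object_key:
                    yield rec
```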

Pure Storage is looking to build versioning and full data lineage features directly into its products, with each update creating a new version of a data object rather than overwriting the previous one. “For an object that you have written six times, you want to be able to retrieve the copy that was four versions ago,” said Botes. “You can see the whole lineage and what was used by which AI query at what time.”
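As an illustration of the pattern Botes describes, here is a toy versioned store, not Pure Storage's implementation, in which every write appends an immutable version and the lineage records who wrote each one:

```python
# Illustrative sketch only (not Pure Storage's implementation): an object
# store where every write creates a new immutable version, so earlier
# copies and their lineage can always be retrieved.
class VersionedObjectStore:
    def __init__(self):
        self._versions: dict[str, list[bytes]] = {}   # key -> ordered versions
        self._lineage: dict[str, list[str]] = {}      # key -> writer of each version

    def put(self, key: str, data: bytes, written_by: str) -> int:
        """Append a new version instead of overwriting; return its number."""
        self._versions.setdefault(key, []).append(data)
        self._lineage.setdefault(key, []).append(written_by)
        return len(self._versions[key])                # versions numbered from 1

    def get(self, key: str, version: int | None = None) -> bytes:
        """Fetch the latest version, or any earlier one by number."""
        versions = self._versions[key]
        return versions[-1] if version is None else versions[version - 1]

    def lineage(self, key: str) -> list[tuple[int, str]]:
        """Full history: (version number, writer) for every update."""
        return list(enumerate(self._lineage[key], start=1))


# "For an object that you have written six times, you want to be able to
# retrieve the copy that was four versions ago":
store = VersionedObjectStore()
for i in range(1, 7):
    store.put("report.csv", f"rev {i}".encode(), written_by=f"agent-{i}")
print(store.get("report.csv", version=2))   # b'rev 2' - four versions before v6
print(store.lineage("report.csv"))
```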

This capability, though not yet available, is essential to fostering trustworthy AI. “The data version is just as important as the model version in creating the result,” said Botes. “So you want that full lineage and history.”

Although the new storage architecture is about more than speed, performance remains essential because AI workloads demand high performance per unit of capacity. In addition, AI algorithms can scan vast datasets – including archived data in long-term storage – at high speed, effectively blurring the distinction between hot and cold data.

This could well accelerate the shift to the all-flash data center. “I think in the AI world, flash is super hard not to use, because you don't have cold datasets the way you used to – AI is looking at all the data, all the time,” said Botes.

However, flash memory has a finite lifespan. Each memory cell degrades every time it goes through a program-erase cycle. Over time, this makes the cell slower until it is eventually marked as failed. Traditionally, small low-power controllers on each storage device use basic statistical models to manage wear, applying broad-brush adjustments to voltages and timings across thousands of cells at a time.

Botes and his team, however, have developed sophisticated AI models that can manage flash media at a granular, per-cell level. By analyzing the precise response time and behavior of each individual cell over its lifespan, the models can make micro-adjustments that improve performance and endurance beyond what traditional controllers can achieve.
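The contrast between the two approaches can be sketched as follows. This is purely illustrative, with made-up coefficients and latency figures rather than Pure Storage's actual models: the traditional controller derives one block-wide correction from aggregate statistics, while the per-cell approach fits a small correction to each cell's own drift.

```python
# Illustrative sketch, not Pure Storage's actual models: contrast a
# traditional controller's block-wide adjustment with a per-cell model that
# fits a tiny correction to each cell's own observed behaviour.
import statistics


def blockwide_offset(latencies_us: list[float]) -> float:
    """Traditional approach: one broad-brush tweak for the whole block,
    derived from aggregate statistics (coefficient is arbitrary)."""
    return 0.01 * statistics.mean(latencies_us)


def per_cell_offsets(latency_history_us: dict[int, list[float]]) -> dict[int, float]:
    """Per-cell approach: each cell gets its own micro-adjustment based on
    how its response time has drifted over its program-erase history."""
    offsets = {}
    for cell, history in latency_history_us.items():
        drift = history[-1] - history[0]          # how much this cell slowed down
        offsets[cell] = 0.01 * history[-1] + 0.05 * drift
    return offsets


# A worn cell (2) drifts more than its neighbours, so only it receives a
# larger correction instead of dragging the whole block along with it.
history = {0: [52.0, 53.0], 1: [51.0, 52.5], 2: [55.0, 71.0]}
print(blockwide_offset([h[-1] for h in history.values()]))  # one value for all
print(per_cell_offsets(history))                            # one value per cell
```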

The data used to train and improve the models comes from telemetry gathered from Pure Storage devices deployed in the field. “Every customer sends us telemetry every 20 seconds, so we have billions of data points every day on how the hardware and chips are being used,” he said.

In fact, the insights are so valuable that Pure Storage feeds the knowledge back to the memory chip manufacturers themselves, helping them design better silicon based on real-world use rather than lab assumptions. “We get this virtuous cycle where we learn more about the chips and feed it back to the manufacturers to improve the chips,” said Botes.

Meanwhile, Pure has also been helping organizations manage the complexity of AI data. It has developed a distributed control plane designed to make a company's entire storage footprint, both on-premise and in the cloud, work as a single unified system, allowing administrators to define service levels, performance characteristics and protection policies for different classes of data.

“It makes many small storage components look like one large, singular storage plane,” said Botes. “You stop worrying about where things are placed and simply focus on the outcomes you want.”
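A rough sketch of that intent-based model, with hypothetical policy fields and a toy placement function standing in for the control plane's logic rather than Pure's actual interface, might look like this: administrators declare outcomes per data class, and the system decides where the data lives across the on-premise and cloud footprint.

```python
# Hypothetical sketch of intent-based policy, not Pure's control-plane API:
# administrators describe outcomes per data class, and a (notional)
# control plane decides placement across the on-premise and cloud footprint.
from dataclasses import dataclass


@dataclass
class DataClassPolicy:
    name: str
    service_level: str          # e.g. "gold", "silver"
    min_iops_per_tb: int        # performance per unit of capacity
    snapshot_interval_min: int  # protection policy
    replicate_to_cloud: bool


policies = [
    DataClassPolicy("training-datasets", "gold", min_iops_per_tb=50_000,
                    snapshot_interval_min=15, replicate_to_cloud=True),
    DataClassPolicy("agent-audit-logs", "gold", min_iops_per_tb=10_000,
                    snapshot_interval_min=5, replicate_to_cloud=True),
    DataClassPolicy("archives", "silver", min_iops_per_tb=1_000,
                    snapshot_interval_min=1_440, replicate_to_cloud=False),
]


def place(policy: DataClassPolicy) -> str:
    """Toy placement decision standing in for the control plane's logic."""
    if policy.min_iops_per_tb >= 10_000:
        return ("on-premise all-flash, mirrored to cloud"
                if policy.replicate_to_cloud else "on-premise all-flash")
    return "cloud object tier"


for p in policies:
    print(p.name, "->", place(p))
```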
