ALEXANDRIA, Va., March 3 -- United States Patent No. 12,566,682, issued on March 3, was assigned to Hewlett Packard Enterprise Development LP (Spring, Texas).
"Resilient fully sharded data parallel" was invented by Lianjie Cao (Milpitas, Calif.), Saeed Rashidi (Spring, Texas), Puneet Sharma (Milpitas, Calif.), Garrett Goon (Spring, Texas) and Paolo Faraboschi (Barcelona, Spain).
According to the abstract* released by the U.S. Patent & Trademark Office: "Systems and methods are provided for failure resiliency in distributed training of machine learning (ML) models. Examples include a plurality of compute nodes storing shards of a plurality of shards of model states of an ML model, and a first compute node storing a first shard of model sta...