A major AI training data set contains millions of examples of personal data
Sensitive information such as passports, credit cards, and identity documents has been found in an open-source dataset used to train AI models.
A major open-source dataset used to train artificial intelligence models likely contains millions of images of sensitive personal documents. Researchers found that the dataset, known as DataComp CommonPool, includes examples of passports, credit cards, and birth certificates. A preliminary examination of only a small portion of the dataset turned up thousands of images featuring identifiable faces and personal information.
The presence of such personal data in widely used AI training sets raises serious privacy concerns and highlights the need for better data curation and security in AI development.
📌 Source
This summary was compiled automatically from the MIT source. See the original article for the full story.