The Wikimedia Foundation, which operates Wikipedia, has published a dataset specifically designed for training artificial intelligence models, in an effort to deal with the AI bots that constantly scrape the platform's content.
The foundation announced that, in collaboration with Kaggle (a platform owned by Google that hosts machine learning data), it has released a beta version of a dataset containing structured Wikipedia content in English and French.
Wikipedia dataset to help artificial intelligence developers
According to Wikimedia, the dataset was designed with developers' needs in mind and makes machine-readable information easier to access for training, fine-tuning, evaluation, alignment and analysis of artificial intelligence models.
The data is released under an open license and includes abstracts, short descriptions, image links, infobox data and article sections, but it omits references and non-text elements such as audio files.
The Wikimedia Foundation says in a statement that this data, provided as JSON files, could be a better alternative to scraping and parsing the raw text of articles. Scraping by bots is currently putting heavy strain on Wikipedia's servers, because these AI bots consume a large share of its bandwidth.
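To give a rough sense of why structured JSON is easier to work with than scraped text, here is a minimal Python sketch. It assumes one JSON object per line, and its field names (`name`, `description`, `abstract`, `sections`) are hypothetical placeholders for illustration, not the dataset's documented schema.

```python
import io
import json

# Hypothetical sample record: the field names are assumptions for illustration,
# not the dataset's documented schema.
SAMPLE = io.StringIO(
    json.dumps({
        "name": "Example article",
        "description": "A short description.",
        "abstract": "A summary of the article.",
        "sections": [{"name": "History", "text": "Some section text."}],
    }) + "\n"
)


def iter_records(lines):
    """Yield one parsed record per non-empty line (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarize(record):
    """Pick out a few fields, tolerating keys that may be missing."""
    return {
        "title": record.get("name", ""),
        "description": record.get("description", ""),
        "section_count": len(record.get("sections", [])),
    }


summaries = [summarize(r) for r in iter_records(SAMPLE)]
print(summaries)
```

With a real download, the in-memory sample would be replaced by an open file handle, and the keys would need to be adjusted to whatever the published files actually contain.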
Wikimedia had previously signed content-sharing agreements with organizations such as Google and the Internet Archive, but the collaboration with Kaggle could make Wikipedia data more accessible to smaller companies and independent researchers.
Brenda Flynn, partnerships lead at Kaggle, said of the collaboration:
“We are very excited to host the Wikimedia Foundation's data. Kaggle will proudly play a role in keeping this data accessible, available and useful.”
RCO NEWS



