Apex-Goliath National AI & Data Science Research Infrastructure

Apex-Goliath - NSTDA


Page 1: Apex-Goliath - NSTDA

Apex-Goliath National AI & Data Science Research Infrastructure

Page 2: Apex-Goliath - NSTDA

What is Apex-Goliath?

Apex-Goliath is a federated AI experimental research and development infrastructure for Thai universities, researchers, students, and startups. It will provide services related to AI research including data storage and computer clusters for AI, High-Performance Computing (HPC), and Cloud services.

Page 3: Apex-Goliath - NSTDA

Apex-Goliath Services

● Federated computing platform with regional nodes
  ○ For university, student, and startup research projects
● Experimental hybrid AI/HPC/Cloud infrastructure
  ○ AI model experiments and training
  ○ Data lake infrastructure and sharing platform
  ○ Inference edges & services
● Commodity + web-scale architecture meets HPC
● Resource sharing + pilot services


[Figure: universities, research centers, and startups run interactive/batch experiments and regional data collection on the federated AI R&D infrastructure]

Page 4: Apex-Goliath - NSTDA

Related Platforms


Apex: AI-focused tasks
● AI, deep learning, inference, ASR, video/image processing
● For universities, research centers, consortiums, startups

ThaiSC: HPC-focused tasks
● Simulation, genomes, aerospace, military
● For national research institutes & government research

GDCC (MoDE)
● Government data, applications, & services
● For government services & public organizations

Workload paths between platforms
● Mixed: mixed data- and compute-intensive workloads
● Scale up: batch high-compute tasks to ThaiSC
● Scale out: roll out successful pilots to GDCC or a private cloud
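The relationship between the three platforms can be sketched as a small routing rule. This is only an illustration of one reading of the slide: the `Workload` class, its fields, and `suggest_platform` are hypothetical names, and the assumption that mixed workloads stay on Apex is an interpretation of the diagram, not a documented policy.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a research workload (not from the slides)."""
    name: str
    compute_heavy: bool = False    # large batch, simulation-style job
    pilot_succeeded: bool = False  # pilot service that is ready for rollout


def suggest_platform(workload: Workload) -> str:
    """Map a workload to the platform role named on the slide."""
    if workload.pilot_succeeded:
        # Scale out: roll out a successful pilot to GDCC or a private cloud.
        return "GDCC or private cloud"
    if workload.compute_heavy:
        # Scale up: batch high-compute tasks to ThaiSC.
        return "ThaiSC"
    # Assumed default: mixed data- and compute-intensive work runs on Apex.
    return "Apex"


if __name__ == "__main__":
    for w in (
        Workload("speech-model-experiments"),
        Workload("genome-simulation", compute_heavy=True),
        Workload("tourism-recommender", pilot_succeeded=True),
    ):
        print(f"{w.name}: {suggest_platform(w)}")
```

In practice the decision would depend on allocation policy and data placement, which the slides do not describe.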

Page 5: Apex-Goliath - NSTDA

Target University & Industry Research Domains

● AI for Health & Medical Security
● AI for Food & Agriculture
● AI for Local Tourism & Economy
● AI for Entertainment & Creative Technology
● AI for Finance / Logistics / Businesses
● Etc.


Page 6: Apex-Goliath - NSTDA

Apex & Goliath Platform Structure

Apex AI Research Platform & Data Exchange Infrastructure
● Hybrid AI/HPC/Cloud infrastructure

Goliath AI & Data Analytics Platform
● Data journalism / sharable research datasets / pre-trained models & notebooks / infrastructure access

Common Schema for Research Datasets

Applications
● Disease control, social distancing
● Travel, retail, & tourism
● Food safety
● Etc.

Data
● Precision agriculture/aquaculture
● IoT & smart buildings/energy
● Health & wearables
● GIS/logistics/retail map
● Common Thai-context corpus (tagged speech/image/texts)

Page 7: Apex-Goliath - NSTDA

Apex AI Research Infrastructure

● R&D Infrastructure
  ○ Hardware infrastructure: High-Performance Computing & Storage System Prototype
    ○ 1,536 logical CPU cores
    ○ 30 petaFLOPS of AI performance (48 x A100 GPUs)
    ○ 3 petabytes of storage
    ○ 200 Gbps HDR interconnect

● Funding Agency: PMU-C (BCG Digital) / NXPO / MHESI
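As a rough sanity check on the hardware figures, the arithmetic below shows one way the quoted numbers fit together. The slide does not say which precision the 30 petaFLOPS figure refers to or how the nodes are configured, so two things here are assumptions: NVIDIA's published A100 peak of 624 TFLOPS (FP16/BF16 Tensor Core with structured sparsity; 312 TFLOPS dense), and 8 GPUs per node.

```python
# Figures quoted on the slide.
NUM_GPUS = 48
LOGICAL_CPU_CORES = 1536

# Assumption: per-GPU peak throughput from NVIDIA's A100 datasheet
# (FP16/BF16 Tensor Core). The slide does not state the precision used.
A100_PEAK_TFLOPS_SPARSE = 624  # with structured sparsity
A100_PEAK_TFLOPS_DENSE = 312   # dense

# Assumption: 8 GPUs per node. The slide does not give the node layout.
GPUS_PER_NODE = 8


def aggregate_petaflops(num_gpus: int, tflops_per_gpu: float) -> float:
    """Sum per-GPU peak throughput and convert TFLOPS to PFLOPS."""
    return num_gpus * tflops_per_gpu / 1000


nodes = NUM_GPUS // GPUS_PER_NODE
logical_cores_per_node = LOGICAL_CPU_CORES // nodes
sparse_pflops = aggregate_petaflops(NUM_GPUS, A100_PEAK_TFLOPS_SPARSE)
dense_pflops = aggregate_petaflops(NUM_GPUS, A100_PEAK_TFLOPS_DENSE)

if __name__ == "__main__":
    print(f"nodes (assumed 8 GPUs each): {nodes}")
    print(f"logical CPU cores per node:  {logical_cores_per_node}")
    print(f"peak with sparsity:          {sparse_pflops:.3f} PFLOPS")
    print(f"peak dense:                  {dense_pflops:.3f} PFLOPS")
```

Under these assumptions the sparse peak comes to roughly 30 petaFLOPS, which matches the slide. These are theoretical peaks, not measured training throughput.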


Page 8: Apex-Goliath - NSTDA


Apex System Architecture

Page 9: Apex-Goliath - NSTDA

Goliath Data Exchange Platform

Goliath allows AI researchers and engineers to share datasets with other researchers, helping them find the datasets they need as well as promoting the use of previously siloed, underutilized—but valuable—datasets.

Powered by Apex, Goliath also lets users tap into Apex’s high-performance computing power for their AI model training needs.


Page 10: Apex-Goliath - NSTDA

Goliath Features


Share Open Datasets With the Community

Goliath provides a platform for individuals and organizations to create open datasets, giving more value to previously siloed, underutilized datasets.

High-Performance Computing With Less Data Transfer

Use existing datasets on Goliath and connect them to your Apex-powered AI projects without uploading your own copy of the dataset for every project.

Control Who Has Access to Your Dataset

With multiple levels of permission types on Goliath, you can make your datasets publicly available, available upon request, or shared only with specific people.
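The three permission levels mentioned for Goliath (public, available upon request, shared with specific people) could be modeled roughly as follows. This is a minimal sketch and not Goliath's actual API: `Visibility`, `Dataset`, `can_access`, and `approve_request` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    """Hypothetical names for the three permission levels on the slide."""
    PUBLIC = "public"          # anyone may use the dataset
    ON_REQUEST = "on_request"  # users ask, the owner approves
    RESTRICTED = "restricted"  # only people the owner names


@dataclass
class Dataset:
    name: str
    owner: str
    visibility: Visibility
    # Users named by the owner, or users whose requests were approved.
    allowed_users: set = field(default_factory=set)


def can_access(dataset: Dataset, user: str) -> bool:
    """Return True if the user may read the dataset."""
    if user == dataset.owner or dataset.visibility is Visibility.PUBLIC:
        return True
    return user in dataset.allowed_users


def approve_request(dataset: Dataset, user: str) -> None:
    """Owner approves an access request; only valid for ON_REQUEST datasets."""
    if dataset.visibility is not Visibility.ON_REQUEST:
        raise ValueError("access requests apply only to on-request datasets")
    dataset.allowed_users.add(user)


if __name__ == "__main__":
    corpus = Dataset("dialect-speech", owner="alice",
                     visibility=Visibility.ON_REQUEST)
    print(can_access(corpus, "bob"))   # False until the request is approved
    approve_request(corpus, "bob")
    print(can_access(corpus, "bob"))   # True
```

Keeping the access check separate from the approval step means a dataset can change visibility later without touching the list of users who were already granted access.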

Page 11: Apex-Goliath - NSTDA

Common Open Datasets to Become Available on Apex & Goliath


● COCO (Common Objects in Context)
● Mozilla Common Voice
● Thai local dialect
● WordNet
● ImageNet
● and more...

Page 12: Apex-Goliath - NSTDA

Common Schema for Research Datasets


To make the most of a diverse set of datasets from multiple sources, we also propose the “CMKL Common Dataset Schema Standard version 1.0” to make merging different datasets easier for researchers.
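To illustrate why a shared schema makes merging easier, the sketch below maps records from two differently shaped sources onto one common record layout. The slides do not describe the contents of the CMKL schema, so every field name here (`source`, `modality`, `uri`, `label`), the sample rows, and the `to_common` helper are hypothetical.

```python
# Hypothetical common record layout; the real CMKL schema is not shown
# in the slides.
COMMON_FIELDS = ("source", "modality", "uri", "label")


def to_common(record: dict, source: str, modality: str, field_map: dict) -> dict:
    """Rename source-specific fields to the common field names."""
    common = {"source": source, "modality": modality}
    for target, original in field_map.items():
        common[target] = record[original]
    return common


# Two made-up sources that use different column names for the same ideas.
speech_rows = [{"clip_path": "clips/0001.wav", "sentence": "sawasdee"}]
image_rows = [{"file_name": "img/0001.jpg", "category": "dog"}]

merged = [
    to_common(r, "speech_corpus", "speech",
              {"uri": "clip_path", "label": "sentence"})
    for r in speech_rows
] + [
    to_common(r, "image_corpus", "image",
              {"uri": "file_name", "label": "category"})
    for r in image_rows
]

if __name__ == "__main__":
    for row in merged:
        print(row)
```

Once every source is expressed in the same fields, combining datasets reduces to list concatenation, and downstream code needs only one record format.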

Page 13: Apex-Goliath - NSTDA

AI Research Global Partnership

● Technology providers: Nvidia, DataDirect Networks
● Academic institutions: Carnegie Mellon University

For more information: https://www.cmkl.ac.th/apex
