Big Data
e-infrastructure of DIAS
Toshihiro Nemoto (1), Masaru Kitsuregawa (2)
(1) Earth Observation Data Integration and Fusion Initiative, The Univ. of Tokyo
(2) Director General of National Institute of Informatics (NII); Professor, Institute of Industrial Science, The Univ. of Tokyo; President of Information Processing Society (IPSJ); IEEE Fellow, ACM Fellow
DIAS
• Data Integration and Analysis System
– For providing access to global and regional sensing data
– information storage infrastructure for public benefit applications
and the deepening of scientific knowledge in the areas of
• Climate
• Water cycle
• …
– for application in
• Fisheries
• Agriculture
• Biodiversity
• …
• e-platform is ‘hidden’ but an important enabler.
Global Earth Observation System of Systems
(GEOSS)
Land Observation
Ocean Observation
Satellite Imagery
WCRP CMIP3
Simulations
Data Integration & Information Fusion Platform
App. Layer
•User Apps. (several, side by side)
Common Utility Layer
•Visualizer (with display wall)
•Discovery Workflow Assist
•Data Quality Manager
•Data Transformer
•Data Crawler
•ETL
•Data Migrator
•Data Navigator
•Metadata Manager
Data Management Layer
•Database management system
File System Layer
Storage Layer
Disk Arrays
•PB-scale logical file
•Storage management
•Power management
MetBroker
NOAA Antenna
(Antenna installed at Roppongi in 1980, operation started in 1981, stationary service in 1983)
Late Prof. M. Takagi
Handmade Receiving Station (bit synchronizer, frame synchronizer), 1981-
Analog Data Recorder, 1982
Mainframe Machine (FACOM M160/170)
Mass Storage
8mm tape archive, 1992-
STK 9310 (Powder Horn) High End Storage (6000 tapes), 2001-
DIAS Today
disk + tape > 20PB
DIAS is (and has been) an on-premises cloud
DIAS/GRENE System Structure
Servers (Cluster)
•8 nodes
•CPU 16 cores/node
•Memory 48GB/node
Disk Array
•~1.4PB
Hokkaido University
Servers (Cluster)
•8 nodes
•CPU 16 or 12 cores/node
•Memory 48GB/node
Disk Array
•~0.7PB
Server
•CPU 64 cores
•Memory 1024GB
Server
•CPU 32 cores
•Memory 512GB
Servers (Cluster)
•64 nodes
•CPU 16 cores/node
•Memory 48GB/node
Disk Array
•~5.2PB
Tape Library
•~6.2PB
Institute of Industrial Science, The University of Tokyo
National Institute of Informatics
Chiba Annex
Kitami Institute of Technology
Server
•CPU 80 cores
•Memory 2048GB
Servers (Cluster)
•60 nodes
•CPU 20 cores/node
•Memory 64GB/node
•with HPC coprocessor
Disk Array
•~11.6PB
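As a quick cross-check of the "disk + tape > 20PB" figure, the approximate capacities listed on this slide can simply be added up. This is only a sketch: the variable names are illustrative, and the slide values are rounded ("~"), so the totals are approximate as well.

```python
# Approximate capacities in PB, copied from the system-structure slide.
# The grouping into two lists is an illustrative assumption.
disk_arrays_pb = [1.4, 0.7, 5.2, 11.6]
tape_libraries_pb = [6.2]

disk_total = round(sum(disk_arrays_pb), 1)
tape_total = round(sum(tape_libraries_pb), 1)
grand_total = round(disk_total + tape_total, 1)

print(f"disk: ~{disk_total} PB, tape: ~{tape_total} PB, total: ~{grand_total} PB")
```

The sum is consistent with the "> 20PB" statement on the "DIAS Today" slide.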
Server-Storage Coupled System
[Figure: server-storage coupled system, with year labels from ~2006 to 2012]
Servers: 64 nodes (16 cores, 48GB each); 1.67GHz 8-core, 128GB (two units); 2.26GHz 64-core, 1024GB; 5GHz 32-core, 512GB
Network: 10 Gigabit Ethernet, 1 Gigabit Ethernet, FC switches
Storage (disk arrays and tape library): units labeled ~160TB, ~0.6PB, ~0.8PB, ~0.8PB, ~2PB, ~2PB, ~3PB
Big Data in Earth Env. Research
CMIP5
GCM20
ERA Interim
JP10
K-1
FieldServer
CMIP3
JRA-25
GPV
DIAS Satellite
CEOP Model
CEOP Satellite
AMSR-E
MODIS Mongolia
MODIS AIT
MODIS NASA
MODIS UT
MTSAT
GMS
[Figure: time coverage of each dataset, axis from 1983 to 2010]
Solutions in DIAS
4Vs for Big Data
Volume
Variety
Velocity
Veracity
Volume
Variety
So many kinds of Data on DIAS
Velocity
(stock and stream)
Global Data to Local Information
[Diagram labels: General Circulation Model; Satellite data; In-situ data; Data Assimilation (two stages); Improved Initial Condition; Regional/Meso Model; Improved prediction; DHM; Socio-Economic Data; Improved Prediction; Centralized Data System; Proactive Control of Dam; Flood Peak Reduction]
[Plot: discharge [m³/s], 0 to 3000, from 7/8 to 7/12, 2002; curves: Optimized rules, Outflow eq. inflow, Outflow eq. 0]
Upper Tone River Basin
Yagisawa
Naramata
Fujiwara
Shimagawa
Aimata
Sonohara
Yamba
Flood reduction by Proactive Control of Dam
Discharge with GPV 13~18
[Plots, 7/8 to 7/12, 2002:
Iwamoto gauge: discharge [m³/s]; curves: Optimized, Outflow eq. inflow, Outflow eq. 0
Sonohara dam: discharge [m³/s] and waterlevel [m]; curves: Sim outflow, Sim inflow, Sim water level, Obs water level
Fujiwara dam: discharge [m³/s] and waterlevel [m]; curves: Sim outflow, Sim inflow, Sim water level, Obs water level
Annotations: Flood peak reduction; Optimized release rules; Peak created due to water release from dams; Water level increase due to storage; Water is stored until max capacity is reached]
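The release rules compared on these plots ("Outflow eq. inflow", "Outflow eq. 0", and an optimized rule) can be illustrated with a toy reservoir mass balance. This is a minimal sketch under stated assumptions, not the DIAS dam model: the function `simulate`, the capped-release rule standing in for the optimized rule, and all inflow and capacity numbers are hypothetical.

```python
def simulate(inflow, release_rule, capacity, storage0=0.0, dt=3600.0):
    """Toy reservoir mass balance: storage changes by (inflow - outflow) * dt.

    inflow: list of inflows [m^3/s], one per time step of length dt [s].
    release_rule(q_in, storage): requested release [m^3/s].
    Water that does not fit into `capacity` [m^3] is spilled in the same step.
    """
    storage = storage0
    outflows = []
    for q_in in inflow:
        q_out = release_rule(q_in, storage)
        # Cannot release a negative amount or more water than is available.
        q_out = max(0.0, min(q_out, q_in + storage / dt))
        storage += (q_in - q_out) * dt
        if storage > capacity:
            q_out += (storage - capacity) / dt  # forced spill once full
            storage = capacity
        outflows.append(q_out)
    return outflows

# Hypothetical hourly inflow hydrograph [m^3/s] and reservoir capacity [m^3].
inflow = [100.0, 200.0, 600.0, 900.0, 500.0, 200.0, 100.0]
capacity = 2.0e6

pass_through = simulate(inflow, lambda q, s: q, capacity)           # outflow eq. inflow
hold_all = simulate(inflow, lambda q, s: 0.0, capacity)             # outflow eq. 0
capped = simulate(inflow, lambda q, s: min(q, 300.0), capacity)     # hypothetical capped release

for name, out in [("pass-through", pass_through), ("hold-all", hold_all), ("capped", capped)]:
    print(f"{name:12s} peak outflow = {max(out):.1f} m^3/s")
```

In this toy setting, holding all water fills the reservoir before the inflow peak arrives, so the peak passes downstream unreduced, while releasing a limited amount throughout keeps room for part of the peak. That is the qualitative point of the "Water is stored until max capacity is reached" annotation; the actual optimized rules in DIAS are not reproduced here.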
DIAS Ensemble Flood Prediction
[Plots: Observed vs. Ensemble Prediction for T12, 2011 (Aimata, Maebashi) and T15, 2011 (Murakami, Maebashi)]
DIAS Ensemble Flood Prediction
Data Integration and Analysis System (DIAS)
Real Time Data Management System
• Meteorological Data: temp., wind, rain, sunshine
• River, Dam: runoff, water level; dam WL, inflow, release
• Radar, NWP: Radar (C-band), NWP Output (GPV)
• Pre-processing, Model input data
Prediction System
• Pre-processing: Radar, GPV
• Ensemble Rainfall Prediction: Rainfall Pattern 1, Rainfall Pattern 2, ..., Rainfall Pattern N
• WEB-DHM: River Runoff, Soil Moisture
• Ensemble Flood Prediction: Flood Pattern 1, Flood Pattern 2, ..., Flood Pattern N
• Statistical Analysis, Error Estimation
• Dam Operation Simulator (Human Operation)
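The flow on this slide (N rainfall patterns, a hydrological model run per pattern, then statistics over the resulting flood patterns) can be sketched as follows. This is an illustration only: `toy_runoff` is a single linear reservoir used as a stand-in and is not WEB-DHM, and the perturbation scheme, function names, and rainfall values are all hypothetical.

```python
import random
import statistics

def perturb_rainfall(base_rain, n_members, spread=0.3, seed=0):
    """Build N rainfall patterns by scaling a base forecast (hypothetical scheme)."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        factor = max(0.0, rng.gauss(1.0, spread))
        members.append([r * factor for r in base_rain])
    return members

def toy_runoff(rain, k=0.3):
    """Single linear reservoir: a stand-in for the hydrological model, NOT WEB-DHM."""
    storage = 0.0
    flow = []
    for r in rain:
        storage += r
        q = k * storage  # a fixed fraction of stored water runs off each step
        storage -= q
        flow.append(q)
    return flow

def ensemble_summary(flows):
    """Statistics over the peak of each ensemble member."""
    peaks = [max(f) for f in flows]
    return {
        "mean_peak": statistics.mean(peaks),
        "stdev_peak": statistics.stdev(peaks),
        "min_peak": min(peaks),
        "max_peak": max(peaks),
    }

base_rain = [0.0, 5.0, 20.0, 40.0, 15.0, 5.0, 0.0, 0.0]  # hypothetical forecast
members = perturb_rainfall(base_rain, n_members=20)       # rainfall patterns 1..N
flows = [toy_runoff(m) for m in members]                  # flood patterns 1..N
summary = ensemble_summary(flows)
print(summary)
```

The spread of peaks across members gives a rough error estimate of the kind the "Statistical Analysis, Error Estimation" box refers to; how DIAS actually generates its rainfall patterns is not stated in the slides.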
Veracity
Data Integration and Analysis System
a legacy for Japan's contributions to GEOSS
accelerating data archiving, including data
loading, QC and metadata registration
CMIP5 Apps
3D Powered Visualizer
Zonal asymmetric anomaly of θ = (θ at given grid) - (global mean of θ at given latitude)
Demo
Climatology by NCEP/NCAR reanalysis data
From late April to late May, there are two warm anomalies developing around the Tibetan Plateau.
One is just above the Tibetan Plateau developing upward.
The other is over the southern slope of the Tibetan Plateau developing downward from the
tropopause.
The former can be explained by the land surface heating of the Tibetan Plateau, but what causes
the upper-level warming over the southern slope of the Tibetan Plateau?
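The zonal asymmetric anomaly defined on this slide (θ at a grid point minus the mean of θ around that latitude circle) can be sketched in a few lines. This is a minimal illustration on a plain latitude-by-longitude list; the function name and the sample values are hypothetical, and the visualizer's actual implementation is not described in the slides.

```python
def zonal_asymmetric_anomaly(theta):
    """theta[lat][lon] -> theta minus the mean over all longitudes at that latitude."""
    anomaly = []
    for row in theta:
        zonal_mean = sum(row) / len(row)
        anomaly.append([value - zonal_mean for value in row])
    return anomaly

# Hypothetical potential temperature field [K]: 2 latitudes x 4 longitudes.
theta = [
    [300.0, 302.0, 304.0, 302.0],
    [290.0, 290.0, 294.0, 290.0],
]
anom = zonal_asymmetric_anomaly(theta)
print(anom)
```

By construction each latitude row of the anomaly sums to zero, so positive values mark longitudes that are warmer than the rest of their latitude circle, which is how the warm anomalies around the Tibetan Plateau stand out.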
3D Vector Visualizer
[Figure labels: Indian Ocean, Tibetan Plateau, Pacific Ocean; South, North, West, East]
Demo
Adiabatic warming is found to be associated with an upper-level anticyclone driven by tropical
convective heating. In other words, a Matsuno-Gill type atmospheric response to convective
heating causes the upper-level warming around the Tibetan Plateau.
Voxel Visualization
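Explore temporal changes in rainfall over Taiwan during the typhoon season through direct user manipulation
TRMM PR rainfall intensity data
Blue: minimum intensity
|
Red: maximum intensity
Cross-section
Data filter
Opacity setting
Dot size setting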
2D Correlation Analyser
Conclusion
DIAS ‘IS’ a big data platform
for earth environment data including
observation data and SC model outputs.
Thanks