Big Data
e-infrastructure of DIAS
Toshihiro Nemoto (1), Masaru Kitsuregawa (2)
(1) Earth Observation Data Integration and Fusion Initiative, The Univ. of Tokyo
(2) Director General of National Institute of Informatics (NII); Professor, Institute of Industrial Science, The Univ. of Tokyo; President of Information Processing Society of Japan (IPSJ); IEEE Fellow; ACM Fellow

DIAS
• Data Integration and Analysis System
– For providing access to global and regional sensing data
– Information storage infrastructure for public benefit applications and the deepening of scientific knowledge in the areas of
  • Climate
  • Water cycle
  • …
– For application in
  • Fisheries
  • Agriculture
  • Biodiversity
  • …
• The e-platform is 'hidden' but an important enabler.

Global Earth Observation System of Systems (GEOSS)
• Land Observation
• Ocean Observation
• Satellite Imagery
• WCRP CMIP3 Simulations
Data Integration & Information Fusion Platform
• App. Layer
– User Apps.
• Common Utility Layer
– Visualizer (with display wall)
– Discovery Work Flow Assist
– Data Quality Manager
– Data Transformer
– Data Crawler
– ETL
– Data Migrator
– Data Navigator
– Metadata Manager
• Data Management Layer
– Database management system
• File System Layer, Storage Layer (Disk Arrays)
– PB scale logical file
– Storage management
– Power management
• MetBroker

NOAA Antenna (antenna installed at Roppongi in 1980, operation started in 1981, stationary service in 1983)
Late Prof. M. Takagi
• Handmade Receiving Station (bit synchronizer, frame synchronizer), '81-
• Analog Data Recorder, '82
• Mainframe Machine (FACOM M160/170)
• Mass Storage: 8mm tape archive, 1992-
• STK 9310 (Powder Horn) High End Storage (6000 tapes), 2001-
• DIAS Today: disk + tape > 20PB

DIAS is (has been) an on-premises Cloud

DIAS/GRENE System Structure
Sites: Hokkaido University; Institute of Industrial Science, The University of Tokyo; National Institute of Informatics Chiba Annex; Kitami Institute of Technology
• Servers (Cluster): 8 nodes, CPU 16 cores/node, Memory 48GB/node; Disk Array ~1.4PB
• Servers (Cluster): 8 nodes, CPU 16 or 12 cores/node, Memory 48GB/node; Disk Array ~0.7PB
• Server: CPU 64 cores, Memory 1024GB; Server: CPU 32 cores, Memory 512GB; Servers (Cluster): 64 nodes, CPU 16 cores/node, Memory 48GB/node; Disk Array ~5.2PB; Tape Library ~6.2PB
• Server: CPU 80 cores, Memory 2048GB; Servers (Cluster): 60 nodes, CPU 20 cores/node, Memory 64GB/node, with HPC coprocessor; Disk Array ~11.6PB

Server-Storage Coupled System
[Figure: system growth from ~2006 to 2012]
• Servers: 64 nodes (16 cores, 48GB each)
• Servers: 1.67GHz 8 cores 128GB (two units); 2.26GHz 64 cores 1024GB; 5GHz 32 cores 512GB
• Network: 10 Gigabit Ethernet, 1 Gigabit Ethernet, FC switches
• Disk arrays and tape library; capacities shown: ~160TB, ~0.6PB, ~0.8PB, ~0.8PB, ~2PB, ~2PB, ~3PB

Big Data in Earth Env. Research
[Figure: dataset timeline, time axis 1983 to 2010]
Datasets: CMIP5, GCM20, ERA Interim, JP10, K-1, FieldServer, CMIP3, JRA-25, GPV, DIAS Satellite, CEOP Model, CEOP Satellite, AMSR-E, MODIS Mongolia, MODIS AIT, MODIS NASA, MODIS UT, MTSAT, GMS

Solutions in DIAS
4Vs for Bigdata
• Volume
• Variety
• Velocity
• Veracity

Volume

Variety
So many kinds of Data on DIAS

Velocity (stock and stream)

Global Data to Local Information
[Figure: flow diagram]
• General Circulation Model, Data Assimilation, Improved prediction
• Satellite data, Regional/Meso Model, Data Assimilation, Improved Initial Condition
• In-situ data, Centralized Data System, DHM, Improved Prediction
• Socio-Economic Data
• Proactive Control of DAM, Flood Peak Reduction
[Plot: discharge [m³/s] against time, 7/8 to 7/12, 2002; curves: Optimized rules, Outflow eq. inflow, Outflow eq. 0]

Upper Tone River Basin
Yagisawa, Naramata, Fujiwara, Shimagawa, Aimata, Sonohara, Yamba

Flood reduction by Proactive Control of Dam Discharge with GPV 13~18
[Plots: Sonohara dam, Fujiwara dam, Iwamoto gauge; discharge [m³/s] and water level [m] against time, 7/8 to 7/12, 2002]
• Dam panels: Sim outflow, Sim inflow, Sim water level, Obs water level
• Iwamoto gauge panel: Optimized release rules, Outflow eq. inflow, Outflow eq. 0
• Annotations: Flood peak reduction; Water level increase due to storage; Peak created due to water release from dams; Water is stored until max capacity is reached

DIAS Ensemble Flood Prediction
[Plots: T12, 2011, Aimata; T15, 2011, Murakami; T12, 2011, Maebashi; T15, 2011, Maebashi]
Legend: Observed, Ensemble Prediction

DIAS Ensemble Flood Prediction
[Figure: system diagram]
• Data Integration and Analysis System (DIAS)
– Meteorological Data: temp., wind, rain, sunshine
– River, Dam: Runoff, Water Level; Dam: WL, inflow, release
– Radar, NWP: Radar (C-band), NWP Output (GPV)
• Real Time Data Management System
– Pre-processing, Model input data
• Prediction System
– Pre-processing: Radar, GPV
– Ensemble Rainfall Prediction: Rainfall Pattern 1, Rainfall Pattern 2, …, Rainfall Pattern N
– WEB-DHM
– Ensemble Flood Prediction: Flood Pattern 1, Flood Pattern 2, …, Flood Pattern N
– River Runoff, Soil Moisture
– Statistical Analysis, Error Estimation
– Dam Operation Simulator (Human Operation)

Veracity

Data Integration and Analysis System
• a legacy for Japan's contributions to GEOSS
• accelerating data archiving, including data loading, QC and metadata registration

CMIP5 Apps

3D Powered Visualizer
zonal asymmetric anomaly of θ = (θ at a given grid point) - (global mean of θ at the given latitude, i.e. the zonal mean)
Demo
Climatology by NCEP/NCAR reanalysis data
From late April to late May, two warm anomalies develop around the Tibetan Plateau. One is just above the Tibetan Plateau and develops upward. The other is over the southern slope of the Tibetan Plateau and develops downward from the tropopause. The former can be explained by the land surface heating of the Tibetan Plateau, but what causes the upper-level warming over the southern slope of the Tibetan Plateau?

3D Vector Visualizer
[Figure labels: Indian Ocean, Tibetan Plateau, Pacific Ocean; North, South, East, West]
Demo
Adiabatic warming is found to be associated with an upper-level anticyclone derived from tropical convective heating. In other words, a Matsuno-Gill type atmospheric response to convective heating causes the upper-level warming around the Tibetan Plateau.

Voxel Visualization
Exploring, through direct user manipulation, the temporal change of rainfall over Taiwan during the typhoon season
• TRMM PR rainfall intensity data
• Blue: minimum intensity | Red: maximum intensity
• Controls: cutting plane, data filter, opacity setting, dot size setting

2D Correlation Analyser

Conclusion
DIAS 'IS' a big data platform for earth environment data including observation data and SC model outputs.

Thanks
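
Supplementary note on the 3D Powered Visualizer slide: the zonal asymmetric anomaly defined there (θ at a grid point minus the mean of θ along the same latitude) can be sketched in a few lines. This is only a minimal illustration in plain Python, assuming θ is held as a latitude-by-longitude grid on a single level; the function name `zonal_anomaly` and the sample values are hypothetical and are not taken from DIAS.

```python
from typing import List


def zonal_anomaly(theta: List[List[float]]) -> List[List[float]]:
    """Subtract the zonal mean from each grid value.

    Assumption: theta[i][j] is the value at latitude index i and
    longitude index j, on one vertical level and at one time.
    """
    anomaly = []
    for row in theta:
        # Mean over all longitudes at this latitude (the zonal mean).
        zonal_mean = sum(row) / len(row)
        anomaly.append([value - zonal_mean for value in row])
    return anomaly


# Hypothetical sample grid: 2 latitudes x 3 longitudes (values in K).
theta = [
    [300.0, 302.0, 304.0],
    [310.0, 310.0, 313.0],
]
anomaly = zonal_anomaly(theta)
print(anomaly)
```

By construction the anomaly sums to zero along every latitude, so what remains is the east-west contrast, such as a warm region standing out from its latitude band.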