Performance Analysis of Two Big Data Technologies on a Cloud Distributed Architecture. Results for Non-Aggregate Queries on Medium-Sized Data
Authors: Marin Fotache, Ionuț Hrubaru
Affiliation: Faculty of Economics and Business Administration, Alexandru Ioan Cuza University of Iaşi, Romania
Journal: Scientific Annals of Economics and Business, 2016, Vol. 63 (s1), pp. 21-50
Source database: De Gruyter Journal
DOI: 10.1515/saeb-2016-0134
Keywords: Big Data, cloud computing, performance benchmarks, Hadoop, Hive, PostgreSQL, Postgres XL, R
Abstract (original language): Big Data systems manage and process huge volumes of data constantly generated by various technologies in a myriad of formats. Big Data advocates (and preachers) have claimed that, relative to classical relational/SQL database management systems, Big Data technologies such as NoSQL, Hadoop and in-memory data stores perform better. This paper compares the data processing performance of two systems, one from the SQL camp (PostgreSQL/Postgres XL) and one from the Big Data camp (Hadoop/Hive), on a distributed five-node cluster deployed in the cloud. Unlike the benchmarks in use (YCSB, TPC), a series of R modules were devised for generating random non-aggregate queries on different subschemas (with increasing data size) of the TPC-H database. Overall performance of the two systems was compared. Subsequently a number of...
Full-text access: De Gruyter
