
A Cross-Datacenter Erasure Code Writing Method Based on Generator Matrix Transformation

Bao Han, Wang Yijie, Xu Fangliang

Citation: Bao Han, Wang Yijie, Xu Fangliang. A Cross-Datacenter Erasure Code Writing Method Based on Generator Matrix Transformation[J]. Journal of Computer Research and Development, 2020, 57(2): 291-305. DOI: 10.7544/issn1000-1239.2020.20190542. CSTR: 32373.14.issn1000-1239.2020.20190542

Funds: This work was supported by the National Key Research and Development Program of China (2016YFB1000101), the National Natural Science Foundation of China (61379052), the Science Foundation of Ministry of Education of China (2018A02002), and the Natural Science Foundation for Distinguished Young Scholars of Hunan Province (14JJ1026).
  • CLC number: TP302.8
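  • Abstract: In recent years, to keep data from being permanently lost when a datacenter fails, major organizations have begun using fault-tolerance techniques to store data in cross-datacenter storage systems. Erasure coding, a fault-tolerance technique that offers high fault tolerance with low redundancy, is widely used in single-datacenter storage systems. In cross-datacenter storage systems, however, existing erasure code writing methods consume large amounts of network resources and have low encoding and transmission efficiency, so the write rate of cross-datacenter erasure codes struggles to keep pace with the ever-growing rate of data generation. To raise that write rate, this paper proposes a cross-datacenter erasure code writing method based on generator matrix transformation (CREW). By optimizing the transmission topology and the generator matrix, CREW keeps the number of data blocks that must travel long distances during a write as small as possible, which reduces network resource consumption. By using distributed data transmission and encoding between datacenters and centralized data transmission and encoding within each datacenter, CREW strikes a good trade-off between encoding efficiency and transmission efficiency. Experiments in a cross-datacenter environment show that CREW's write rate is 36.3% to 57.9% higher than that of two widely used traditional erasure code writing methods, and 32.4% higher than that of IncEncoding, an existing cross-datacenter erasure code writing method.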
    Abstract: In cross-datacenter storage systems, existing erasure code writing methods usually have low encoding efficiency, low transmission efficiency, and large network resource consumption. As a result, cross-datacenter erasure codes usually have a low write rate. This paper proposes a cross-datacenter erasure code writing method based on generator matrix transformation, called CREW. Specifically, we first propose a greedy transmission topology construction algorithm called GBTC, which builds a tree-structured transmission topology from top to bottom, with incremental weights (the weights are set to the network distances between datacenters), to organize data transmission between datacenters. We then propose a generator matrix transformation algorithm called GMT. Without changing the linear relationship among coded blocks, GMT transforms the generator matrix so that the number of data blocks related to a coded block is negatively correlated with the network distance between the datacenter holding that coded block and the root of the tree-structured topology. CREW therefore needs to transfer only a small number of data blocks over long network distances when writing data, which reduces network resource consumption. Finally, we propose a distributed pipelined writing algorithm called DPW, which distributes encoding operations across different nodes for parallel execution and limits the number of times a data block is forwarded, thereby improving encoding efficiency and transmission efficiency. Experiments show that, compared with the writing methods of traditional erasure codes, CREW increases the write rate by 36.3% to 57.9%. Compared with IncEncoding, an existing writing method for cross-datacenter erasure codes, CREW increases the write rate by 32.4%.
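The abstract does not spell out how GBTC works, so the following is only a minimal sketch of one plausible greedy construction, not the authors' algorithm. It uses a Prim-style rule: starting from a root datacenter, repeatedly attach the unattached datacenter that is closest to any datacenter already in the tree. The function names `greedy_tree` and `tree_cost` and the distance matrix `DIST` are hypothetical and chosen for illustration.

```python
import heapq


def greedy_tree(dist, root):
    """Build a tree over datacenters 0..n-1 with a Prim-style greedy rule.

    dist is a symmetric matrix of network distances (an assumed input format).
    Returns a dict mapping each datacenter to its parent (None for the root).
    """
    n = len(dist)
    parent = {root: None}
    # Candidate edges (distance, tree node, outside node) leaving the tree.
    heap = [(dist[root][v], root, v) for v in range(n) if v != root]
    heapq.heapify(heap)
    while heap and len(parent) < n:
        _, u, v = heapq.heappop(heap)
        if v in parent:
            continue  # v was already attached through a shorter edge
        parent[v] = u
        for x in range(n):
            if x not in parent:
                heapq.heappush(heap, (dist[v][x], v, x))
    return parent


def tree_cost(dist, parent):
    """Sum of the network distances on all tree edges."""
    return sum(dist[p][c] for c, p in parent.items() if p is not None)


# Hypothetical distances between four datacenters (arbitrary units).
DIST = [
    [0, 10, 40, 50],
    [10, 0, 15, 60],
    [40, 15, 0, 20],
    [50, 60, 20, 0],
]

tree = greedy_tree(DIST, root=0)
print(tree)                   # parent of each datacenter
print(tree_cost(DIST, tree))  # total edge distance of the tree
```

With these made-up distances the result is a chain 0, 1, 2, 3 whose edge weights grow with depth, which loosely mirrors the "incremental weights from top to bottom" property the abstract describes. A Prim-style rule does not guarantee that property in general.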

Publication history
  • Published: 2020-01-31
