Enhancing Modern Storage using Chunk-Based Data Deduplication in ABF-HTFC Algorithm

Publication: Article, 12 Nov 2025
Publisher: Everant Journals
Journal: INTERNATIONAL JOURNAL OF MULTIDISCIPLINARY RESEARCH AND ANALYSIS, volume 8 (ISSN: 2643-9840, eISSN: 2643-9875)

Authors: Ashis Kumar Mohapatra

DOI: 10.47191/ijmra/v8-i11-22, 10.5281/zenodo.17897417, 10.5281/zenodo.17897418

Abstract

Cloud computing has become increasingly popular due to its ease of access, large storage capacity, and flexible payment models. Data reduction is a widely used technique for minimizing the storage of unnecessary data items and reducing maintenance overhead, and research on data reduction in cloud-based systems is increasingly driven by the rapid growth of data volume in cloud storage services. However, valuable storage space is lost when users upload multiple copies of the same data, and identifying duplicate chunks within files is challenging. To address this problem, we propose an Attribute-based Bloom Filter Hash Table with File Counting (ABF-HTFC) algorithm that removes redundant information and identifies duplicate stored data using chunk-based data deduplication. First, chunk boundary detection is sped up using the Fast Content-Defined Chunking (FastCDC) algorithm to achieve high-speed processing and effectively eliminate unnecessary storage. Next, data integrity and reliability in cloud environments are ensured by using the Cryptographic Hashing - SHA-256 (CH-SHA-256) algorithm to generate chunk fingerprints and improve indexing efficiency. Finally, the ABF-HTFC algorithm performs deduplication by removing redundant chunk information and accurately identifying duplicate data in cloud storage. The proposed method outperforms the previous technique in identifying duplicate chunks. It was evaluated using storage performance metrics such as latency, throughput, cloud storage capacity, execution time, and deduplication ratio, and the storage efficiency is improved to 94.25%.
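The three stages named in the abstract (content-defined chunking, SHA-256 fingerprinting, and a Bloom-filter-fronted hash table with counts) can be illustrated with a minimal sketch. This is not the author's ABF-HTFC implementation: the chunker is a simplified gear-hash variant rather than full FastCDC, the "file counting" is assumed here to be a per-chunk reference count, the attribute-based part is omitted because the abstract does not specify it, and all names and parameters (`MIN_SIZE`, `AVG_MASK`, `MAX_SIZE`, `BloomFilter`, `DedupStore`) are hypothetical.

```python
import hashlib
import random

# Hypothetical chunking parameters, chosen for illustration only.
MIN_SIZE, AVG_MASK, MAX_SIZE = 64, 0xFF, 1024

# Fixed pseudo-random table for the gear rolling hash (seeded for repeatability).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes):
    """Split data at content-defined boundaries (simplified, not full FastCDC)."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i - start + 1
        # Cut when the hash matches the mask, or when the chunk grows too large.
        if (size >= MIN_SIZE and (h & AVG_MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class BloomFilter:
    """Small Bloom filter; may give false positives, never false negatives."""

    def __init__(self, bits=1 << 16, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, key: bytes):
        for k in range(self.hashes):
            digest = hashlib.sha256(bytes([k]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.array[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: bytes):
        return all(self.array[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class DedupStore:
    """Stores each unique chunk once and counts how often it is referenced."""

    def __init__(self):
        self.bloom = BloomFilter()
        self.table = {}  # fingerprint -> [chunk bytes, reference count]
        self.files = {}  # file name -> ordered list of fingerprints

    def put(self, name: str, data: bytes):
        recipe = []
        for c in chunk(data):
            fp = hashlib.sha256(c).digest()  # chunk fingerprint
            # The Bloom filter rejects most new chunks cheaply; the hash
            # table confirms real duplicates and rules out false positives.
            if self.bloom.might_contain(fp) and fp in self.table:
                self.table[fp][1] += 1
            else:
                self.bloom.add(fp)
                self.table[fp] = [c, 1]
            recipe.append(fp)
        self.files[name] = recipe

    def get(self, name: str) -> bytes:
        return b"".join(self.table[fp][0] for fp in self.files[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c, _ in self.table.values())
```

In this sketch, uploading the same content under two names stores its chunks once and only raises their counts, so `stored_bytes()` stays no larger than the size of a single copy while `get()` still reconstructs each file from its list of fingerprints.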

Keywords

Data deduplication, cloud storage, chunk file, hash function, Bloom filter, FastCDC, CH-SHA-256.
