Disk fragmentation defragmentation


2013. 12. 11.
I have used Windows for a long time, and I used to run the disk defragmenter whenever the PC felt slow, or once at the end of each year. The machine did feel faster afterwards, but one day a question occurred to me.

Why does nobody defragment disks on Linux?

To answer that, I first looked into what disk defragmentation actually is and why it is needed.

fragmentation, defragmentation
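Disk defragmentation on Windows is a program that resolves the fragmentation that builds up as files are repeatedly created, modified, and deleted. Ideally a file would sit in one contiguous region of the disk, but in practice it ends up stored in pieces scattered across the disk. Defragmentation rearranges those scattered pieces so that they sit in order again.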
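To make the mechanism concrete, here is a minimal sketch of a toy disk with a naive first-fit allocator. It only illustrates how create/delete cycles leave holes that split a later file; it is not how any real filesystem allocates, and the names `allocate`, `free_file`, and `fragments` are made up for this example.

```python
def allocate(disk, name, nblocks):
    """Put nblocks of file `name` into the first free slots (naive first-fit)."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < nblocks:
        raise RuntimeError("disk full")
    for i in free[:nblocks]:
        disk[i] = name


def free_file(disk, name):
    """Delete a file by marking all of its blocks as free."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def fragments(disk, name):
    """Count the contiguous runs of blocks that belong to `name`."""
    runs = 0
    previous_was_ours = False
    for owner in disk:
        ours = owner == name
        if ours and not previous_was_ours:
            runs += 1
        previous_was_ours = ours
    return runs


disk = [None] * 16          # 16 empty blocks
allocate(disk, "A", 4)      # blocks 0-3
allocate(disk, "B", 4)      # blocks 4-7
allocate(disk, "C", 4)      # blocks 8-11
free_file(disk, "B")        # leaves a 4-block hole in the middle
allocate(disk, "D", 6)      # fills the hole, then spills past C

print(disk)
print(fragments(disk, "A"), fragments(disk, "D"))  # 1 2
```

File D needs six blocks but the hole left by B holds only four, so D ends up in two pieces even though the disk had enough free space overall.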

So why do the pieces need to be put back in order?

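A hard disk reads data by moving a head across the platter. Fragmentation therefore affects only hardware that has a head, that is, hardware with seek time. When a file is scattered across the disk, the head has to travel to each piece, and the more scattered the file is, the more time goes to head movement. An SSD has no head, so fragmentation does not cause this kind of slowdown there. Once defragmentation has laid files out grouped by type and in data order, head movement is kept to a minimum.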
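The cost of scattering can be sketched with a very rough model in which the head's travel is just the difference between block numbers. Real drives have rotational latency, caches, and request reordering, so the numbers below are illustrative only; `head_travel` and the block positions are assumptions made for this example.

```python
def head_travel(block_positions, start=0):
    """Total distance the head moves when reading the blocks in the given order."""
    travel = 0
    position = start
    for block in block_positions:
        travel += abs(block - position)
        position = block
    return travel


# The same six-block file, stored two different ways.
contiguous = [100, 101, 102, 103, 104, 105]
fragmented = [100, 101, 500, 501, 20, 21]

print(head_travel(contiguous))  # 105
print(head_travel(fragmented))  # 983
```

In this toy model the fragmented layout makes the head travel roughly nine times as far to read the same amount of data.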

The defragmentation process is illustrated in the animation at the link below.
http://en.wikipedia.org/wiki/File:FragmentationDefragmentation.gif
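What the animation shows can be approximated in a few lines. This is a simplified sketch that only computes the target layout: each block is a hypothetical `(file, logical_index)` pair, and a real defragmenter would also have to move data in place and stay safe across crashes.

```python
def defragment(disk):
    """Return a layout where every file is contiguous and in logical order,
    with all free space gathered at the end."""
    used = sorted(block for block in disk if block is not None)
    return used + [None] * (len(disk) - len(used))


# None marks a free block; ("A", 1) is the second block of file A.
disk = [("A", 0), ("B", 1), None, ("A", 1), ("B", 0), None, ("A", 2), None]
compacted = defragment(disk)
print(compacted)
```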

Disk I/O is also among the slowest operations in a computer, and the CPU ends up waiting on it instead of doing other work, so the slowdown is easy to notice.

defragmenter by filesystem
Defragmentation therefore looks like an essential step. According to Wikipedia, though, how it is handled depends on the filesystem.

The following is quoted from Wikipedia.
Approach and defragmenters by file-system type
FAT: MS-DOS 6.x and Windows 9x-systems come with a defragmentation utility called Defrag. The DOS version is a limited version of Norton SpeedDisk.^[9] The version that came with Windows 9x was licensed from Symantec Corporation, and the version that came with Windows 2000 and XP is licensed from Condusiv Technologies.
NTFS was introduced with Windows NT 3.1, but the NTFS filesystem driver did not include any defragmentation capabilities. In Windows NT 4.0, a few defragmenting APIs were introduced that third-party tools could use to perform defragmentation tasks; however, no defragmentation software was included. In Windows 2000, Windows XP and Windows Server 2003, Microsoft included a defragmentation tool based on Diskeeper^[10] that made use of the defragmentation APIs and was a snap-in for Computer Management. In Windows Vista, Windows 7 and Windows 8, Microsoft appears to have written their own defragmenter^{[citation needed]} which has no visual diskmap and is not part of Computer Management. A number of free and commercial third-party defragmentation products are also available for Microsoft Windows.
BSD UFS and particularly FreeBSD uses an internal reallocator that seeks to reduce fragmentation right in the moment when the information is written to disk. This effectively controls system degradation after extended use.
Linux ext2, ext3, and ext4: Much like UFS, these filesystems employ allocation techniques designed to keep fragmentation under control at all times. As a result, defragmentation is not needed in the vast majority of cases. ext2 uses an offline defragmenter called e2defrag, which does not work with its successor ext3. However, other programs, or filesystem-independent ones, may be used to defragment an ext3 filesystem. ext4 is somewhat backward compatible with ext3, and thus has generally the same amount of support from defragmentation programs. In practice there are no stable and well-integrated defragmentation solutions for Linux, and thus no defragmentation is performed.

http://en.wikipedia.org/wiki/Defragmentation#Approach_and_defragmenters_by_file-system_type

Some filesystems ship with a defragmentation tool, while others build in techniques that reduce fragmentation in the first place. Looking at the features of Linux ext4 for more detail, I found two techniques that reduce fragmentation.

Features
Large file system
The ext4 filesystem can support volumes with sizes up to 1 exbibyte (EiB) and files with sizes up to 16 tebibytes (TiB).^[9] Volumes larger than 16 tebibytes (TiB) are not recommended.^[10]^[11]
Extents
Extents replace the traditional block mapping scheme used by ext2 and ext3. An extent is a range of contiguous physical blocks, improving large file performance and reducing fragmentation. A single extent in ext4 can map up to 128 MiB of contiguous space with a 4 KiB block size.^[1] There can be four extents stored in the inode. When there are more than four extents to a file, the rest of the extents are indexed in an HTree.
Backward compatibility
ext4 is backward compatible with ext3 and ext2, making it possible to mount ext3 and ext2 as ext4. This will slightly improve performance, because certain new features of ext4 can also be used with ext3 and ext2, such as the new block allocation algorithm.
ext3 is partially forward compatible with ext4. That is, ext4 can be mounted as ext3 (using "ext3" as the filesystem type when mounting). However, if the ext4 partition uses extents (a major new feature of ext4), then the ability to mount as ext3 is lost.
Persistent pre-allocation
ext4 can pre-allocate on-disk space for a file. To do this on most file systems, zeros would be written to the file when created. In ext4 (and some other files systems such as XFS) fallocate(), a new system call in the Linux kernel, can be used. The allocated space would be guaranteed and likely contiguous. This situation has applications for media streaming and databases.
Delayed allocation
ext4 uses a performance technique called allocate-on-flush also known as delayed allocation. That is, ext4 delays block allocation until it writes data to disk. (In contrast, some file systems allocate blocks before writing data to disk.) Delayed allocation improves performance and reduces fragmentation by using the actual file size to improve block allocation.
Increasing the 32,000 subdirectory limit
In ext3 a directory can have at most 32,000 subdirectories. Ext4 allows an unlimited number of subdirectories.^[12] To allow for larger directories and continued performance, ext4 turns on HTree indexes (a specialized version of a B-tree) by default. This feature is implemented in Linux 2.6.23. In ext3 HTrees can be used by enabling the dir_index feature.
Journal checksumming
ext4 uses checksums in the journal to improve reliability, since the journal is one of the most used files of the disk. This feature has a side benefit: it can safely avoid a disk I/O wait during journaling, improving performance slightly. Journal checksumming was inspired by a research paper from the University of Wisconsin, titled IRON File Systems^[13] (specifically, section 6, called "transaction checksums"), with modifications within the implementation of compound transactions performed by the IRON file system (originally proposed by Sam Naghshineh in the RedHat summit).
Faster file system checking
In ext4 unallocated block groups and sections of the inode table are marked as such. This enables e2fsck to skip them entirely and greatly reduces the time it takes to check the file system. Linux 2.6.24 implements this feature.
fsck time/Inode Count (ext3 vs. ext4)
Multiblock allocator
When ext3 appends to a file, it calls the block allocator, once for each block. Consequently, if there are multiple concurrent writers, files can easily become fragmented on disk. However, ext4 uses delayed allocation which allows it to buffer data and allocate groups of blocks. Consequently the multiblock allocator can make better choices about allocating files contiguously on disk. The multiblock allocator can also be used when files are opened in O_DIRECT mode. This feature does not affect the disk format.
Improved timestamps
As computers become faster in general and as Linux becomes used more for mission-critical applications, the granularity of second-based timestamps becomes insufficient. To solve this, ext4 provides timestamps measured in nanoseconds. In addition, 2 bits of the expanded timestamp field are added to the most significant bits of the seconds field of the timestamps to defer the year 2038 problem for an additional 204 years.
ext4 also adds support for date-created timestamps. But, as Theodore Ts'o points out, while it is easy to add an extra creation-date field in the inode (thus technically enabling support for date-created timestamps in ext4), it is more difficult to modify or add the necessary system calls, like stat() (which would probably require a new version) and the various libraries that depend on them (like glibc). These changes would require coordination of many projects. So even if ext4 developers implement initial support for creation-date timestamps, this feature will not be available to user programs for now.^[14]
http://en.wikipedia.org/wiki/Ext4#Features
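The "Persistent pre-allocation" item in the list above can be tried from user space. The sketch below uses Python's `os.posix_fallocate`, which wraps the POSIX `posix_fallocate` call rather than calling Linux's `fallocate()` directly; whether the reserved space is actually contiguous depends on the filesystem. The helper name `preallocate`, the file name, and the `ftruncate` fallback for platforms without `posix_fallocate` are assumptions made for this example.

```python
import os
import tempfile


def preallocate(path, size):
    """Reserve `size` bytes for `path` up front and return the resulting file size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if hasattr(os, "posix_fallocate"):
            # Ask the filesystem to reserve the whole range [0, size).
            os.posix_fallocate(fd, 0, size)
        else:
            # Fallback (assumption): only sets the length, reserves nothing.
            os.ftruncate(fd, size)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "prealloc.bin")
    size = preallocate(target, 1024 * 1024)  # reserve 1 MiB before writing any data
    print(size)
```

Reserving the space in one call gives the allocator the full size up front, instead of letting the file grow piece by piece between other writes.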

extent
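The word extent literally means a range. Here it refers to a technique that reduces fragmentation by reserving contiguous space for a file ahead of time, in units of blocks. Even as data is allocated, deleted, and modified, this keeps the file's data from getting scattered, which makes it one of the most effective defenses against fragmentation. The technique is used not only by Linux filesystems but also by NTFS on Windows.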
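A small sketch shows why an extent is a compact way to describe a file: a run of contiguous blocks collapses into a single `(start, length)` pair. The function `to_extents` and the block numbers are made up for this illustration. The 32768-block cap corresponds to the 128 MiB per extent at a 4 KiB block size mentioned in the ext4 feature list quoted earlier.

```python
BLOCK_SIZE = 4096            # 4 KiB blocks
MAX_EXTENT_BLOCKS = 32768    # 32768 * 4 KiB = 128 MiB per extent


def to_extents(blocks, max_len=MAX_EXTENT_BLOCKS):
    """Collapse physical block numbers (in file order) into (start, length) extents."""
    extents = []
    for block in blocks:
        if extents:
            start, length = extents[-1]
            # Extend the last extent if this block directly follows it.
            if block == start + length and length < max_len:
                extents[-1] = (start, length + 1)
                continue
        extents.append((block, 1))
    return extents


print(to_extents([100, 101, 102, 103, 200, 201, 50]))  # [(100, 4), (200, 2), (50, 1)]
print(MAX_EXTENT_BLOCKS * BLOCK_SIZE // 2**20)         # 128
```

Seven block numbers become three extents here. A file with no fragmentation at all would need only one extent per 128 MiB.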

An extent is a contiguous area of storage in a computer file system, reserved for a file. When a process creates a file, file-system management software allocates a whole extent. When writing to the file again, possibly after doing other write operations, the data continues where the previous write left off. This reduces or eliminates file fragmentation and possibly file scattering too.
An extent-based file system (i.e., one that addresses storage via extents rather than in single blocks) need not require limiting each file to a single, contiguous extent.
The following systems support extents:
ASM - Automatic Storage Management - Oracle's database-oriented filesystem.
BFS - BeOS, Zeta and Haiku operating systems.
Btrfs - GPL'd extent based file storage for Linux.
Ext4 - Linux filesystem (when the configuration enables extents — the default in Linux since version 2.6.23).
Files-11 - Digital Equipment Corporation (subsequently Hewlett-Packard) OpenVMS filesystem.
HFS and HFS Plus - Hierarchical File System - Apple Macintosh filesystems.
HPFS - High Performance File System - OS/2 and eComStation.
JFS - Journaled File System - Used by AIX, OS/2/eComStation and Linux operating systems.
Melio FS - a shared disk file system for Windows from Sanbolic.
Microsoft SQL Server - Versions 2000-2008 supports extents of up to 64KB [1].
Multi-Programming Executive - Filesystem by Hewlett-Packard.
NTFS - Microsoft's latest-generation file system
OCFS2 - Oracle Cluster File System - a shared disk file system for Linux.
Reiser4 - Linux filesystem (in "extents" mode).
SINTRAN III - File system used by early computer company Norsk Data.
UDF - Universal Disk Format - Standard for optical media.
VERITAS File System - Enabled via the pre-allocation API and CLI.
XFS - SGI's second generation file system.
http://en.wikipedia.org/wiki/File_system_fragmentation#Cause

delayed allocation
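Data to be written is not stored immediately. It is collected block by block in an area called the dirty buffer and written to disk later. Fewer I/O operations mean lower CPU usage, and because data is written in batches of a certain size, fragmentation is reduced as well.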

Allocate-on-flush (also called delayed allocation) is a computer file system feature implemented in the HFS+,^[1] XFS, Reiser4, ZFS, Btrfs and ext4^[2] file systems. The feature also closely resembles an older technique that Berkeley's UFS called "block reallocation".
When blocks must be allocated to hold pending writes, disk space for the appended data is subtracted from the free-space counter, but not actually allocated in the free-space bitmap. Instead, the appended data is held in memory until it must be flushed to storage due to memory pressure, when the kernel decides to flush dirty buffers, or when the application performs the Unix "sync" system call, for example.
This has the effect of batching together allocations into larger runs. Such delayed processing reduces CPU usage, and tends to reduce disk fragmentation, especially for files which grow slowly. It can also help in keeping allocations contiguous when there are several files growing at the same time. When used in conjunction with copy on write as it is in ZFS, it can convert slow random writes into fast sequential writes.

http://en.wikipedia.org/wiki/Delayed_allocation
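The batching effect described above can be sketched with two files that grow at the same time. This is a toy model, not ext4's allocator: `immediate` hands out one block per write as it arrives, while `delayed` only counts pending blocks and assigns each file one contiguous run at flush time. All function names are assumptions made for this example.

```python
from collections import defaultdict


def count_runs(blocks):
    """Number of contiguous runs in a list of block numbers."""
    return sum(1 for i, b in enumerate(blocks) if i == 0 or b != blocks[i - 1] + 1)


def immediate(writes):
    """Allocate the next free block for every write, in arrival order."""
    layout = defaultdict(list)
    next_free = 0
    for name in writes:
        layout[name].append(next_free)
        next_free += 1
    return dict(layout)


def delayed(writes):
    """Keep writes in memory, then allocate one run per file at flush time."""
    pending = defaultdict(int)
    for name in writes:
        pending[name] += 1          # data stays buffered; only a counter changes
    layout = {}
    next_free = 0
    for name, nblocks in pending.items():   # the "flush"
        layout[name] = list(range(next_free, next_free + nblocks))
        next_free += nblocks
    return layout


writes = ["a", "b", "a", "b", "a", "b"]     # two files appended to in turn
imm = immediate(writes)
dly = delayed(writes)
print(imm["a"], count_runs(imm["a"]))  # [0, 2, 4] 3
print(dly["a"], count_runs(dly["a"]))  # [0, 1, 2] 1
```

Because the allocator in the delayed case knows each file's final size before it picks blocks, both files end up contiguous instead of interleaved.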

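Conclusion
The filesystems of older, legacy Windows versions probably included no techniques that accounted for fragmentation, and personal PCs held many small files of many kinds, so fragmentation was likely to occur more often.
More recent filesystems include techniques that reduce fragmentation. On Linux, ext3 and, even more so, ext4 help keep fragmentation from arising, so defragmentation is not needed. Strictly speaking, it is not that fragmentation never occurs, but that it is much less likely.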

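In addition, Linux systems are commonly run with the system directories split across partitions by purpose. With the disk logically divided this way, fragmentation should occur less often, and the performance loss it causes should be smaller.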

License: Attribution, NonCommercial, ShareAlike

morenice's blog
