Split and merge files from the command line
Although some file archivers offer the option of splitting files, this can be easily accomplished with two commands: split and cat.
Topics
Splitting a file with split
Merging the parts that were created with split
Split a file by lines
Splitting a file with split
split just needs the size of the parts that we want to create and the file that we want to split, e.g.:
split -b 1024 file_to_split.bin
If this file is 6 kibibytes long, the command will create 6 files of 1 kibibyte each, named xaa, xab, xac, xad, xae and xaf.
The parameter -b defines the size of the resulting parts. You can add a suffix to the number: either one of the SI suffixes KB (kilobyte: 1000 bytes), MB (megabyte: 1000×1000 bytes), GB (gigabyte: 1000×1000×1000 bytes), TB, PB, EB, ZB, YB, or one of the IEC suffixes K (kibibyte: 1024 bytes), M (mebibyte: 1024×1024 bytes), G (gibibyte: 1024×1024×1024 bytes), T, P, E, Z, Y. E.g.:
split -b 1K file_to_split.bin
split -b 10M file_to_split.bin
split -b 1KB file_to_split.bin
split -b 10MB file_to_split.bin
These examples would create parts of 1,024 bytes, 10,485,760 bytes, 1,000 bytes and 10,000,000 bytes respectively.
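The difference between the two families of suffixes can be checked on a small file. The following is only a sketch, assuming GNU split (other implementations may not accept the two-letter suffixes); the temporary directory and the 3000-byte test file are made up for the illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A 3000-byte test file (hypothetical; any file larger than 1024 bytes works).
head -c 3000 /dev/zero > file_to_split.bin

# IEC suffix: the first part should be 1024 bytes long.
split -b 1K file_to_split.bin
size_iec=$(wc -c < xaa)
rm xaa xab xac

# SI suffix: the first part should be 1000 bytes long.
split -b 1KB file_to_split.bin
size_si=$(wc -c < xaa)

echo "1K part: $size_iec bytes, 1KB part: $size_si bytes"
```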
If we don't want the default prefix x, we can change it by adding the new prefix after the name of the file that we want to split, e.g.:
split -b 1024 file_to_split.bin a_part_
If we are splitting a file of 6 kibibytes as in the first example, this would generate the files: a_part_aa, a_part_ab, a_part_ac, a_part_ad, a_part_ae and a_part_af.
We can change the length of the suffix in the resulting files, and we can choose between an alphabetic suffix (the default) and a numeric suffix. How many parts we can create depends on these two features of the suffix. If we keep a length of 2 and don't use a numeric suffix, we can split a file into up to 676 parts (26×26). If split runs out of suffixes, it will fail, leaving us with the files created up to the moment it failed (recent versions of GNU split avoid this by lengthening the suffix automatically when -a is not given).
To change the length of the suffix, use the parameter -a followed by a number. E.g.:
split -b 1024 -a 4 file_to_split.bin
Following the first example again, we would end up with the files: xaaaa, xaaab, xaaac, xaaad, xaaae and xaaaf.
To use a numeric suffix, use the parameter -d. Of course, with a numeric suffix and a length of 2 we can split a file into up to 100 parts. E.g.:
split -b 1024 -d file_to_split.bin
And using the first example one last time, we would end up with the files: x00, x01, x02, x03, x04 and x05.
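The prefix, the suffix length and the numeric suffix can be combined in a single command. This is a sketch assuming GNU split (the -d option is not available in every implementation); the temporary directory and the 6 KiB test file are placeholders for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A 6 KiB test file, matching the size used in the examples above.
head -c 6144 /dev/zero > file_to_split.bin

# Custom prefix, numeric suffix, suffix length of 3.
split -b 1024 -a 3 -d file_to_split.bin a_part_

# Count and list the parts that were created.
part_count=$(ls a_part_* | wc -l)
ls a_part_*
```

With a numeric suffix of length 3 the parts should be named a_part_000 through a_part_005, and up to 1000 parts could be created.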
Merging the parts that were created with split
Since the files created with the split command are sequential, we can simply use cat to merge these files into a new file, e.g.:
cat x?? > reconstructed_file.bin
The question mark acts as a wildcard character that matches exactly one character. How many question marks we use depends, of course, on the length of the suffixes used when creating the parts.
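To confirm that the merged file is identical to the original, the two can be compared with cmp. This is a minimal sketch; the temporary directory and the random test file are assumptions made for the example. It relies on the shell expanding x?? in sorted order, which is what puts the parts back in sequence.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A 6 KiB test file with random content, so ordering mistakes would be detected.
head -c 6144 /dev/urandom > file_to_split.bin

# Split into 1 KiB parts, then merge them back.
split -b 1024 file_to_split.bin
cat x?? > reconstructed_file.bin

# cmp -s is silent and only reports through its exit status.
if cmp -s file_to_split.bin reconstructed_file.bin; then
    result="identical"
else
    result="different"
fi
echo "$result"
```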
Split a file by lines
split also allows us to split a text file by lines rather than by the size of the resulting parts. I am sure this was very useful at some point in the past, but I can't think of a reason to split a text file by lines other than for experimental or didactic purposes, since text processors are quite capable of dealing with very large text files. (Great: just days after writing this I found a good application, splitting long lists of URLs and splitting those long MySQL files full of commands so that we can upload them to web-based systems that can't handle big files. I left this comment for humorous purposes.) Nevertheless, the parameter to split a text file by lines is -l followed by the number of lines. E.g.:
split -l 20 file_to_split.txt
This will create one file for every chunk of 20 lines in the original file, so if the file had 54 lines, it would create the files xaa (lines 1-20), xab (lines 21-40) and xac (lines 41-54).
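The 54-line case can be reproduced with a generated file. This is only a sketch; the temporary directory and the use of seq to produce a numbered test file are assumptions made for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A 54-line test file whose lines contain their own line number.
seq 1 54 > file_to_split.txt

split -l 20 file_to_split.txt

# Line counts of each part, and the first line of the last part.
lines_aa=$(wc -l < xaa)
lines_ab=$(wc -l < xab)
lines_ac=$(wc -l < xac)
first_of_ac=$(head -n 1 xac)

echo "xaa: $lines_aa lines, xab: $lines_ab lines, xac: $lines_ac lines"
echo "xac starts at line $first_of_ac"
```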
There is another option in split, a mix between splitting the file by size in bytes and by number of lines. In this mode we specify a maximum size in bytes for the parts, and split will fit as many complete lines as possible in each part without exceeding the specified size. For this we use the parameter -C followed by a number representing the size in bytes. You can use any of the suffixes that are valid for the option -b. E.g.: assume that we have a file that contains 20 lines of 100 bytes each (counting the newline character), totalling 2000 bytes, and we use the following command:
split -C 512 file_to_split.txt
This would give us the files: xaa (lines 1-5, since the sixth line, being 100 bytes long, would not fit in the 512 bytes that we set as the limit), xab (lines 6-10), xac (lines 11-15) and xad (lines 16-20). Each file would have a size of 500 bytes.
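The same scenario can be built and checked from the shell. This is a sketch assuming GNU split (the -C option is not part of every implementation); the temporary directory and the way the 100-byte lines are generated are assumptions made for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 20 lines of 100 bytes each: 99 zero-padded digits plus the newline.
for i in $(seq 1 20); do
    printf '%099d\n' "$i"
done > file_to_split.txt
total_size=$(wc -c < file_to_split.txt)

# At most 512 bytes per part, without breaking any line.
split -C 512 file_to_split.txt

size_aa=$(wc -c < xaa)
size_ad=$(wc -c < xad)
lines_aa=$(wc -l < xaa)

echo "total: $total_size bytes, xaa: $size_aa bytes in $lines_aa lines"
```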