-
Notifications
You must be signed in to change notification settings - Fork 0
/
compress.1
400 lines (400 loc) · 10.8 KB
/
compress.1
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
.Dd February 19, 2023
.Dt COMPRESS 1
.Os Dennix
.Sh NAME
.Nm compress ,
.Nm uncompress ,
.Nm zcat
.Nd compress and decompress data
.Sh SYNOPSIS
.Nm
.Op Fl 0123456789cdfghklnNOqrtvVz
.Op Fl b Ar level
.Op Fl m Ar algo
.Op Fl o Ar filename
.Op Fl S Ar suffix
.Op Fl T Ar threads
.Op Ar files...
.Nm uncompress
.Op Fl 0123456789cdfghklnNOqrtvVz
.Op Fl b Ar level
.Op Fl m Ar algo
.Op Fl o Ar filename
.Op Fl S Ar suffix
.Op Fl T Ar threads
.Op Ar files...
.Nm zcat
.Op Fl 0123456789cdfghklnNOqrtvVz
.Op Fl b Ar level
.Op Fl m Ar algo
.Op Fl o Ar filename
.Op Fl S Ar suffix
.Op Fl T Ar threads
.Op Ar files...
.Sh DESCRIPTION
The
.Nm
utility compresses or decompresses files.
Multiple compression algorithms are supported.
.Pp
The following options choose the operating mode of
.Nm :
.Bl -tag -width Ds
.It Fl d , -decompress , -uncompress
Decompress files.
Each operand is taken as an input file to be decompressed.
When an operand is named
.Cm - ,
or if no operands are given, the standard input is used as the input file and
the decompressed data is written to the standard output.
If an operand has no suffix associated with a supported algorithm, or if no file
named by the operand exists and the operand does not end with
.Pa .Z ,
the suffix
.Pa .Z
is appended to the operand and the resulting name is used as the input file.
.Pp
Unless the
.Fl c
option is specified, the algorithm used for decompression is the one associated
with the suffix of the input file.
When the
.Fl c
option is specified, the algorithm is instead determined by the file contents.
.It Fl l , -list
List information about compressed files.
The following information is listed:
.Bl -tag -width "uncompressed name"
.It compressed
The size of the compressed file.
.It uncompressed
The size of the uncompressed file.
.It ratio
The compression ratio by which the size of the file was reduced.
.It uncompressed name
The name that will be used for the output file when decompressing.
It the
.Fl N
option is specified, this will be the name saved inside the compressed file if
any, otherwise this is the name of the compressed file with its suffix removed.
.El
.Pp
If the
.Fl v
option is specified, the following additional information is printed in front of
the information above:
.Bl -tag -width "uncompressed name"
.It method
The algorithm that was used for compression.
This value is suitable for use as an argument to the
.Fl m
option.
.It crc
The 32-bit CRC (cyclic redundancy code) of the uncompressed data, if it is saved
within the compressed file.
For algorithms other than DEFLATE, this value is printed as
.Li ffffffff .
.It date , time
The modification date and time of the compressed data.
If the
.Fl N
option is specified, this will be the modification time saved inside the
compressed file if any, otherwise this is the modification time of the
compressed file itself.
.El
.Pp
A title line listing the information that is printed is also displayed, unless
the
.Fl q
option is specified.
.It Fl t , -test
Test the integrity of compressed files without changing any files.
.It Fl z , -compress
Compress files.
Each operand is taken as an input file to be compressed.
When an operand is named
.Cm - ,
or if no operands are given, the standard input is used as the input file and
the compressed data is written to the standard output.
.Pp
Unless the
.Fl c
option is specified, the compressed output is written into a file named like the
input file but with the suffix associated to the algorithm appended.
If the compression does not result in a size reduction and the
.Fl f
option is not given, the output file is unlinked and the input file is not
unlinked.
Otherwise, the input file is unlinked.
.El
.Pp
These options may be given multiple times.
Only the last of these options selects the mode.
The default mode for
.Nm
is compression
.Pq Fl z .
The
.Nm uncompress
utility is used for decompression and is equivalent to
.Nm
.Fl d .
The
.Nm zcat
utility decompresses files to the standard output and is equivalent to
.Nm
.Fl cd .
Both
.Nm uncompress
and
.Nm zcat
support all options supported by
.Nm .
.Pp
The following additional options are supported by
.Nm :
.Bl -tag -width Ds
.It Fl 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9
Set the compression level to the number given.
These options are equivalent to use of the
.Fl b
option with the number given as an option argument, except that the DEFLATE
(gzip) algorithm is used if no compression algorithm was selected explicitly
using the
.Fl g , m
or
.Fl O
options.
.It Fl b Ar level
Set the compression level.
The meaning depends on the compression algorithm used.
Usually a higher
.Ar level
achieves higher compression but might be slower.
For the LZW algorithm,
.Ar level
specifies the maximum number of bits to use in a code.
The values must be between 9 and 16 inclusive.
For the DEFLATE algorithm,
.Ar level
must be between 1 and 9 inclusive.
For the XZ format,
.Ar level
must be between 0 and 9 inclusive.
.It Fl c , -stdout , -to-stdout
Write the result of compression or decompression to the standard output and do
not unlink the input files.
For decompression, when the
.Fl c
option is used the algorithm is determined only by the file contents, not by the
file extension.
.It Fl f , -force
Always produce an output file even if compression does not result in a reduced
size.
Also overwrite existing output files without asking for confirmation.
.Pp
When used with both the
.Fl c
and
.Fl d
options, any files that are not in a recognized format are written to the
standard output without change.
.It Fl g
Use the DEFLATE (gzip) algorithm for compression.
Equivalent to
.Fl m Cm gzip .
.It Fl h , -help
Print a short help message and exit.
.It Fl k , -keep
Do not unlink input files.
.It Fl m Ar algo
For compression use the compression algorithm specified by
.Ar algo .
This option has no effect on decompression.
The following compression algorithms are supported:
.Bl -tag -width deflate
.It Cm deflate
The DEFLATE (gzip) algorithm.
The suffix associated with this algorithm is
.Pa .gz .
Additionally when decompressing, the suffix
.Pa .tgz
is supported which will be replaced by
.Pa .tar
to produce the output file name.
.It Cm gzip
Synonym for
.Cm deflate .
.It Cm lzw
The Lempel-Ziv-Welch (LZW) algorithm.
The suffix associated with this algorithm is
.Pa .Z .
Additionally when decompressing, the suffix
.Pa .taz
is supported which will be replaced by
.Pa .tar
to produce the output file name.
This algorithm is used by default.
.It Cm xz
The XZ format.
The suffix associated with this algorithm is
.Pa .xz .
Additionally when decompressing, the suffix
.Pa .txz
is supported which will be replaced by
.Pa .tar
to produce the output file name.
.El
.It Fl n , -no-name
When compressing, do not save the original file name and modification time in
the compressed file.
Undo the effects of any previously specified
.Fl N
option.
.It Fl N , -name
When decompressing, use the file name saved inside the compressed file if any as
the output file name and restore the saved modification time.
Note that not all algorithms support saving the original file name and
modification time.
Undo the effects of any previously specified
.Fl n
option.
.It Fl o Ar filename
Use
.Ar filename
as the output file name.
This option cannot be used with multiple input files or with any of the
.Fl clrt
options.
.It Fl O
Use the Lempel-Ziv-Welch (LZW) algorithm for compression.
Equivalent to
.Fl m Cm lzw .
.It Fl q , -quiet
Suppress any warnings.
Errors are still displayed.
Undo the effects of any previously specified
.Fl v
option.
.It Fl r , -recursive
Recursively compress or decompress directories.
For any given operand that names a directory, all files in that directory are
compressed or decompressed.
If a file in a directory already has the file extension associated with the used
algorithm, that file will be ignored.
.It Fl S Ar suffix , Fl -suffix Ns = Ns Ar suffix
When compressing, use
.Ar suffix
as the suffix of the output file instead of the suffix associated with the
algorithm.
.Pp
When decompressing, if the file named by an operand does not exist or has no
suffix associated with an algorithm, the operand is taken as the output file and
.Ar suffix
is appended to get the input name.
The algorithm for decompression is then determined by the file contents instead
of by the suffix.
.It Fl T Ar threads , Fl -threads Ns = Ns Ar threads
Use up to
.Ar threads
threads for compression.
Multithreading is currently only supported for XZ compression.
When
.Ar threads
is 0
.Nm
will use one thread per CPU core.
.Pp
If the
.Fl T
option is not used
.Nm
will use a number of threads determined by the number of CPUs and the amount of
available memory.
.It Fl v , -verbose
For each file print the size reduction or expansion of the file.
Undo the effects of any previously specified
.Fl q
option.
.It Fl V , -version
Print version information and exit.
.It Fl -best
Select the highest possible compression level.
.It Fl -fast
Select the lowest possible compression level.
.El
.Sh EXIT STATUS
The
.Nm
utility exits 0 if all input files were successfully compressed or decompressed.
It exits 1 or >2 when an error occured.
It exits 2 when at least one file was not compressed because compression would
have resulted in an increase of size.
.Pp
.Ex -std uncompress zcat
.Sh SEE ALSO
.Xr gzip 1 ,
.Xr tar 1 ,
.Xr xz 1
.Sh STANDARDS
The
.St -p1003.1-2008
standard specifies the
.Nm ,
.Nm uncompress
and
.Nm zcat
utilities as part of the XSI option.
Only the
.Fl bcfv
options and the Lempel-Ziv-Welch algorithm with between 9 and 14 bits per code
are standardized.
.Pp
The next revision of the POSIX.1 standard will additionally standardize the
.Fl dgm
options, add the DEFLATE algorithm and require support for 15 and 16 bits per
code for the LZW algorithm.
.Pp
All other options are extensions to the standard.
.Pp
The DEFLATE compression algorithm and the gzip file format are standardized in
RFC1951 and RFC1952.
.Sh HISTORY
The
.Nm
utility was first implemented by
.An Spencer Thomas
in 1984.
The first
.Nm
implementation that supported the DEFLATE algorithm was the version that
appeared in
.Ox 2.1
in 1997.
.Pp
dxcompress 1.0 (released in 2020) was the first
.Nm
implementation that implemented all new requirements from the latest
POSIX.1-202x drafts.
It was also the first
.Nm
implementation that supported XZ compression.
.Pp
Support for multithreaded XZ compression and the
.Fl Tz
options were added in dxcompress 1.1.
.Sh BUGS
The gzip format contains a timestamp field that cannot represent dates from the
year 2038 on.
The timestamp will be ignored when the date cannot be represented.
.Pp
When using the 9-bit LZW compression some codes will actually be stored as
a 10-bit value with the most significant bit being unused.
This is because of a bug in the original
.Nm
implementation that other implementations need to be compatible with.
However some
.Nm
implementations are known to handle 9-bit compression incorrectly resulting in
malformed files.
Thus 9-bit LZW compression should not be used.