This file is indexed.

/usr/share/doc/png-definitive-guide/html/chapter09.html is in png-definitive-guide 20060430-1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

   1
   2
   3
   4
   5
   6
   7
   8
   9
  10
  11
  12
  13
  14
  15
  16
  17
  18
  19
  20
  21
  22
  23
  24
  25
  26
  27
  28
  29
  30
  31
  32
  33
  34
  35
  36
  37
  38
  39
  40
  41
  42
  43
  44
  45
  46
  47
  48
  49
  50
  51
  52
  53
  54
  55
  56
  57
  58
  59
  60
  61
  62
  63
  64
  65
  66
  67
  68
  69
  70
  71
  72
  73
  74
  75
  76
  77
  78
  79
  80
  81
  82
  83
  84
  85
  86
  87
  88
  89
  90
  91
  92
  93
  94
  95
  96
  97
  98
  99
 100
 101
 102
 103
 104
 105
 106
 107
 108
 109
 110
 111
 112
 113
 114
 115
 116
 117
 118
 119
 120
 121
 122
 123
 124
 125
 126
 127
 128
 129
 130
 131
 132
 133
 134
 135
 136
 137
 138
 139
 140
 141
 142
 143
 144
 145
 146
 147
 148
 149
 150
 151
 152
 153
 154
 155
 156
 157
 158
 159
 160
 161
 162
 163
 164
 165
 166
 167
 168
 169
 170
 171
 172
 173
 174
 175
 176
 177
 178
 179
 180
 181
 182
 183
 184
 185
 186
 187
 188
 189
 190
 191
 192
 193
 194
 195
 196
 197
 198
 199
 200
 201
 202
 203
 204
 205
 206
 207
 208
 209
 210
 211
 212
 213
 214
 215
 216
 217
 218
 219
 220
 221
 222
 223
 224
 225
 226
 227
 228
 229
 230
 231
 232
 233
 234
 235
 236
 237
 238
 239
 240
 241
 242
 243
 244
 245
 246
 247
 248
 249
 250
 251
 252
 253
 254
 255
 256
 257
 258
 259
 260
 261
 262
 263
 264
 265
 266
 267
 268
 269
 270
 271
 272
 273
 274
 275
 276
 277
 278
 279
 280
 281
 282
 283
 284
 285
 286
 287
 288
 289
 290
 291
 292
 293
 294
 295
 296
 297
 298
 299
 300
 301
 302
 303
 304
 305
 306
 307
 308
 309
 310
 311
 312
 313
 314
 315
 316
 317
 318
 319
 320
 321
 322
 323
 324
 325
 326
 327
 328
 329
 330
 331
 332
 333
 334
 335
 336
 337
 338
 339
 340
 341
 342
 343
 344
 345
 346
 347
 348
 349
 350
 351
 352
 353
 354
 355
 356
 357
 358
 359
 360
 361
 362
 363
 364
 365
 366
 367
 368
 369
 370
 371
 372
 373
 374
 375
 376
 377
 378
 379
 380
 381
 382
 383
 384
 385
 386
 387
 388
 389
 390
 391
 392
 393
 394
 395
 396
 397
 398
 399
 400
 401
 402
 403
 404
 405
 406
 407
 408
 409
 410
 411
 412
 413
 414
 415
 416
 417
 418
 419
 420
 421
 422
 423
 424
 425
 426
 427
 428
 429
 430
 431
 432
 433
 434
 435
 436
 437
 438
 439
 440
 441
 442
 443
 444
 445
 446
 447
 448
 449
 450
 451
 452
 453
 454
 455
 456
 457
 458
 459
 460
 461
 462
 463
 464
 465
 466
 467
 468
 469
 470
 471
 472
 473
 474
 475
 476
 477
 478
 479
 480
 481
 482
 483
 484
 485
 486
 487
 488
 489
 490
 491
 492
 493
 494
 495
 496
 497
 498
 499
 500
 501
 502
 503
 504
 505
 506
 507
 508
 509
 510
 511
 512
 513
 514
 515
 516
 517
 518
 519
 520
 521
 522
 523
 524
 525
 526
 527
 528
 529
 530
 531
 532
 533
 534
 535
 536
 537
 538
 539
 540
 541
 542
 543
 544
 545
 546
 547
 548
 549
 550
 551
 552
 553
 554
 555
 556
 557
 558
 559
 560
 561
 562
 563
 564
 565
 566
 567
 568
 569
 570
 571
 572
 573
 574
 575
 576
 577
 578
 579
 580
 581
 582
 583
 584
 585
 586
 587
 588
 589
 590
 591
 592
 593
 594
 595
 596
 597
 598
 599
 600
 601
 602
 603
 604
 605
 606
 607
 608
 609
 610
 611
 612
 613
 614
 615
 616
 617
 618
 619
 620
 621
 622
 623
 624
 625
 626
 627
 628
 629
 630
 631
 632
 633
 634
 635
 636
 637
 638
 639
 640
 641
 642
 643
 644
 645
 646
 647
 648
 649
 650
 651
 652
 653
 654
 655
 656
 657
 658
 659
 660
 661
 662
 663
 664
 665
 666
 667
 668
 669
 670
 671
 672
 673
 674
 675
 676
 677
 678
 679
 680
 681
 682
 683
 684
 685
 686
 687
 688
 689
 690
 691
 692
 693
 694
 695
 696
 697
 698
 699
 700
 701
 702
 703
 704
 705
 706
 707
 708
 709
 710
 711
 712
 713
 714
 715
 716
 717
 718
 719
 720
 721
 722
 723
 724
 725
 726
 727
 728
 729
 730
 731
 732
 733
 734
 735
 736
 737
 738
 739
 740
 741
 742
 743
 744
 745
 746
 747
 748
 749
 750
 751
 752
 753
 754
 755
 756
 757
 758
 759
 760
 761
 762
 763
 764
 765
 766
 767
 768
 769
 770
 771
 772
 773
 774
 775
 776
 777
 778
 779
 780
 781
 782
 783
 784
 785
 786
 787
 788
 789
 790
 791
 792
 793
 794
 795
 796
 797
 798
 799
 800
 801
 802
 803
 804
 805
 806
 807
 808
 809
 810
 811
 812
 813
 814
 815
 816
 817
 818
 819
 820
 821
 822
 823
 824
 825
 826
 827
 828
 829
 830
 831
 832
 833
 834
 835
 836
 837
 838
 839
 840
 841
 842
 843
 844
 845
 846
 847
 848
 849
 850
 851
 852
 853
 854
 855
 856
 857
 858
 859
 860
 861
 862
 863
 864
 865
 866
 867
 868
 869
 870
 871
 872
 873
 874
 875
 876
 877
 878
 879
 880
 881
 882
 883
 884
 885
 886
 887
 888
 889
 890
 891
 892
 893
 894
 895
 896
 897
 898
 899
 900
 901
 902
 903
 904
 905
 906
 907
 908
 909
 910
 911
 912
 913
 914
 915
 916
 917
 918
 919
 920
 921
 922
 923
 924
 925
 926
 927
 928
 929
 930
 931
 932
 933
 934
 935
 936
 937
 938
 939
 940
 941
 942
 943
 944
 945
 946
 947
 948
 949
 950
 951
 952
 953
 954
 955
 956
 957
 958
 959
 960
 961
 962
 963
 964
 965
 966
 967
 968
 969
 970
 971
 972
 973
 974
 975
 976
 977
 978
 979
 980
 981
 982
 983
 984
 985
 986
 987
 988
 989
 990
 991
 992
 993
 994
 995
 996
 997
 998
 999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<TITLE>Compression and Filtering (PNG: The Definitive Guide)</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

<!-- http://www.w3.org/TR/REC-CSS2/box.html -->
<STYLE TYPE="text/css">
  P { margin-bottom: 0em }
  UL {
    margin-bottom: 0em;
    margin-top: 0em;
    list-style: disc;
  }
  LI {
    padding: 0px 0px 0px 0px;
    margin: 0px 0px 0px 0px;
  }
</STYLE>

<LINK REV="made" HREF="http://pobox.com/~newt/greg_contact.html">
<!--  Copyright (c) 1999 O'Reilly and Associates.  -->
<!--  Copyright (c) 2002-2006 Greg Roelofs.  -->
</HEAD>

<body bgcolor="#ffffff" text="#000000">


<hr> <!-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->

<a href="chapter08.html"><img width=24 height=13 border=0 align="left"
 src="images/prev.png" alt="&lt;-"></a>

<a href="chapter10.html"><img width=24 height=13 border=0 align="right"
 src="images/next.png" alt="-&gt;"></a>

<div align="center">
  <a href="chapter08.html"><font size="-1" color="#000000"
   ><b>PREVIOUS</b></font></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a
     href="toc.html"><font size="-1" color="#000000"
   ><b>CONTENTS</b></font></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a
     href="chapter10.html"><font size="-1" color="#000000"
   ><b>NEXT</b></font></a>
</div>

<hr> <!-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->


<h1 class="chapter">Chapter 9. Compression and Filtering</h1>

<div class="htmltoc"><h4 class="tochead">Contents:</h4><p>
<a href="#png.ch09.div.1">9.1. Filtering</a><br />
<a href="#png.ch09.div.2">9.2. The Deflate Compression Algorithm</a><br />
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#png.ch09.div.2.1">9.2.1. A Final Word on Patents</a><br />
<a href="#png.ch09.div.3">9.3. Real-World Comparisons</a><br />
<a href="#png.ch09.div.4">9.4. Practical Compression Tips</a><br />
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#png.ch09.div.4.1">9.4.1. Tips for Users</a><br />         
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#png.ch09.div.4.2">9.4.2. Tips for Programmers</a><br />   
</p></div>


<p>One of PNG's strengths, particularly in comparison to the GIF and TIFF image
formats, is its compression. As I noted in 
<a href="chapter01.html">Chapter 1, "An Introduction to PNG"</a>, a primary motivation 
driving the design of the Portable Network
Graphics format was to create a replacement for GIF that was not only free
but also an improvement over it in essentially all respects. As a result,
PNG compression is completely lossless--that is, the original image data can be reconstructed exactly, bit for
bit--just as in GIF and most forms of TIFF.<a href="#FOOTNOTE-67">[67]</a>

</p><blockquote class="footnote">
<a name="FOOTNOTE-67" />
<a name="INDEX-672" />
<a name="INDEX-673" />
<p>
[67] And as a corollary, PNG file sizes are usually considerably larger
than ordinary JPEG, since the latter uses lossy compression--that is,
it throws away some information. TIFF also supports JPEG compression as
one of its many options, but the more common methods are lossless and
based on either run-length encoding (RLE) or the same LZW algorithm used
in GIF.
</p>
</blockquote>


<!--    GRR POST-1999 UPDATE    -->
<font color="#006600">

<p>
<em>I wrote a longer, more technically detailed chapter on PNG compression
for the</em> <a href="http://www.elsevier-international.com/catalogue/title.cfm?ISBN=0126208611">Lossless Compression Handbook</a>,
<em>edited by Khalid Sayood and published in December 2002 by Academic Press
(now Elsevier Science).  It includes more rigorous test data, as well.  A
near-final draft is available in PDF format at the following link:</em>
</p>

    <!-- was "http://www.amazon.com/exec/obidos/ASIN/0126208611" -->
    <!-- was
         "http://www.apcatalog.com/cgi-bin/AP?ISBN=0126208611&amp;FORM=FORM2"
         should be visible at US site, but usually comes up missing:
         "http://www.bhusa.com/apcatalog/us/subindex.asp?isbn=0126208611"
     -->

<blockquote>
<p>
<pre><a href=
    "LCH-png-chapter.pdf"
    >http://www.libpng.org/pub/png/book/LCH-png-chapter.pdf</a></pre>
</p>
</blockquote>

<p>
<em>I will update it to the final version and convert it to HTML format
when time permits.</em>
</p>

</font>
<!--    END GRR POST-1999 UPDATE    -->


<div class="sect1"><a name="png.ch09.div.1" />
<h2 class="sect1">9.1. Filtering</h2>


<p><a name="INDEX-674" />We'll look at the compression engine itself shortly, but PNG's performance
is not due solely to an improved compression algorithm. PNG also supports a
precompression step called <em class="emphasis">filtering</em>.
Filtering is a method of reversibly transforming the image data so that the
main compression engine can operate more efficiently. As a simple example,
consider a sequence of bytes increasing uniformly from 1 to 255. Since there
is no repetition in the sequence, it compresses either very poorly or not at
all. But a trivial modification of the sequence--namely, leaving the first
byte alone but replacing each subsequent byte by the difference between it
and its predecessor--transforms the sequence into an extremely compressible
set of 255 identical bytes, each having the value 1.</p>


<p>As a real-life example of this (though still not particularly realistic),
consider the image known as <em class="emphasis">16million.png</em>. This
24-bit, 512&nbsp;&times; 32,768 RGB image contains one pixel of every
possible color--more than 16 million of them altogether. As raw data, it
therefore requires 48&nbsp;MB to
store. Simple PNG-style compression with no filtering brings it down to
36&nbsp;MB, only a 25% reduction in size. But with filtering turned on, the same
compression engine reduces it to 115,989 bytes, more than 300 times better
than the nonfiltered case, for a total compression factor of 434!<a href="#FOOTNOTE-68">[68]</a>


<em class="emphasis">Zowie</em>.</p><blockquote class="footnote">


<a name="FOOTNOTE-68" /><p>[68] Actually, it gets even better. The dimensions of the image were chosen
for convenient web-browser scrolling, but a 4096&nbsp;&times; 4096 version created
by Paul Schmidt is half the size--a mere 59,852 bytes (841 times compression).
And just wait until we get to the chapter on MNG!</p>


</blockquote>


<p>Actual image data is rarely that perfect, but filtering does improve
compression in grayscale and truecolor images, and it can help on some
palette images as well. PNG supports five types of filters, and an encoder
may choose to use a different filter for each row of pixels in the image.
<a href="#png.ch09.tbl.1">Table 9-1</a> lists the five filter types.
</p>


<a name="png.ch09.tbl.1" />
<div class="table" align="center">

<p>
<table width="502" border="0">
  <tr>
    <td>
      <b class="emphasis-bold">Table 9-1.</b>
      <i>PNG Filter Types</i>
    </td>
  </tr>
</table>
</p>

<p>
<table width="502" border="1">

<tr>
<td><b class="emphasis-bold">&nbsp;Name&nbsp;</b></td>
<td><b class="emphasis-bold">Description</b></td>
</tr>

<tr>
<td>&nbsp;None&nbsp;</td>
<td>Each byte is unchanged.</td>
</tr>

<tr>
<td>&nbsp;Sub&nbsp;</td>
<td><p>Each byte is replaced with the difference between it and the
``corresponding byte'' to its left.</p></td>
</tr>

<tr>
<td>&nbsp;Up&nbsp;</td>
<td><p>Each byte is replaced with the difference between it and the byte
above it (in the previous row, as it was before filtering).
<a name="INDEX-675" /></p></td>
</tr>

<tr>
<td>&nbsp;Average&nbsp;</td>
<td><p>Each byte is replaced with the difference between it and the average
of the corresponding bytes to its left and above it, truncating any
fractional part.
<a name="INDEX-676" /></p></td>
</tr>

<tr>
<td>&nbsp;Paeth&nbsp;</td>
<td><p>Each byte is replaced with the difference between it and the <em class="emphasis">Paeth
predictor</em> of the corresponding bytes to its left, above it, and to
its upper left.
<?x-space 5p?></p></td>
</tr>

</table>
</p>
</div>



<p><a name="INDEX-677" />
<a name="INDEX-678" />
<a name="INDEX-679" />The last method requires some explanation. Invented by Alan Paeth, the
Paeth predictor is computed by first calculating a <em class="emphasis">base value</em>, equal
to the sum of the corresponding bytes to the left and above, minus the byte
to the upper left. (For example, the base value might equal
228 + 228 - 227 = 229.) Then the difference between the base value
and each of the three corresponding bytes is calculated, and the byte that
gave the smallest absolute difference--that is, the one that was closest
to the base value--is used as the predictor and subtracted from the target
byte to give the filtered value. In case of ties, the corresponding byte
to the left has precedence as the predicted value, followed by the one
directly above. Note that all calculations to produce the Paeth predictor
are done using exact integer arithmetic. The final filter calculation, on
the other hand, is done using base-256 modular arithmetic; this is true for
all of the filter types.
</p>


<p>Though the concept is simple, there are quite a few subtleties in the actual
mechanics of filtering. Most important among these is that filtering always
operates on bytes, not pixels. For images with pixels smaller than eight bits,
this means that the filter algorithms actually operate on more than one pixel
at a time; for example, in a 2-bit palette or grayscale image, there are
four pixels per byte. This approach improves the efficiency of decoders
by avoiding bit-level manipulations.</p>


<p>At the other end of the spectrum, large pixels (e.g., 24-bit RGB or 64-bit
RGBA) are also operated on as bytes, but only <em class="emphasis">corresponding</em> bytes are
compared. For any given byte, the corresponding byte to its left is the
one offset by the number of bytes per pixel. This means that red bytes in a
truecolor image are compared with red bytes, green with green, and blue with
blue. If there's an alpha channel, the alpha bytes are always compared;
if the sample depth is 16 bits, upper (most significant) bytes are compared
with upper bytes in the same color channel, and lower bytes are compared with
lower. In other words, similar values will always be compared and operated
on, in hopes of improving compression efficiency. Consider an RGB image,
for example; the red, green, and blue values of any given pixel may be quite
different, but neighboring pairs of red, green, and blue will often be
similar. Thus the transformed bytes will tend to be close to zero even if
the original bytes weren't. This is the real point of filtering: most of
the transformed bytes will cluster around zero, thus giving the compression
engine a smaller, more predictable range of byte values to cope with.
</p>


<p><a name="INDEX-680" />What about edges? If the ``corresponding byte'' to the left or above doesn't
exist, the algorithm does not wrap around and use bytes from the other side
of the image; instead, it treats the missing byte as zero. The wraparound
method was, in fact, considered, but aside from the fact that one cannot wrap
the top edge of the image without completely breaking the ability to stream
and progressively display a PNG image, the designers felt that only a few
images would benefit (and minimally, at that), which did not justify the
potential additional complexity.</p>


<p><a name="INDEX-681" />Interlacing is also a bit of a wrench in the works. For the purposes of
filtering, each interlace pass is treated as a separate image with its own
width and height. For example, in a 256&nbsp;&times; 256 interlaced image, the
passes would be treated as seven smaller images with dimensions 32&nbsp;&times; 32,
32&nbsp;&times; 32, 64&nbsp;&times; 32, 64&nbsp;&times; 64, 128&nbsp;&times; 64, 128&nbsp;&times; 128, and
256&nbsp;&times; 128, respectively.<a href="#FOOTNOTE-69">[69]</a>
This avoids the nasty problem of how to define corresponding bytes between
rows of different widths.</p><blockquote class="footnote">


<a name="FOOTNOTE-69" /><p>[69] Yes, that adds up to the right number of pixels. (Go ahead, add it
up.) Note that things may not come out quite so cleanly in cases in which
one or both image dimensions are not evenly divisible by eight; the
width of each pass is rounded up, if necessary.</p>


</blockquote>


<p><a name="INDEX-682" />So how does an encoder actually choose the proper filter for each row?
Testing all possible combinations is clearly impossible: even a 20-row
image would require testing over 95 trillion combinations, where ``testing''
would involve filtering and compressing the entire image. A simpler approach,
though still computationally expensive, is to incrementally test-compress each
row, save the smallest result, and repeat for the next row. This amounts to
filtering and compressing the entire image five times, which may be a
reasonable trade-off for an image that will be transmitted and decoded many
times.</p>


<p>But users often have barely enough patience to wait for a single round of
compression, so the PNG development group has come up with a few rules of thumb
(or heuristics) for choosing filters wisely. The first rule is that filters
<a name="INDEX-683" />
<a name="INDEX-684" />are rarely useful on palette images, so don't even bother with them. Note,
however, that one has considerable freedom in choosing how to order entries
in the palette itself, so it is possible that a particular method of ordering
would actually result in image data that benefits significantly from filtering.
No one has yet proven this, however, and the most likely approaches would
involve doing statistics on every pair of pixels in the image. Such
approaches would be quite expensive for larger images.</p>


<p>Filters are also rarely useful on low-bit-depth (grayscale) images, although
there have been rare cases in which promoting such an image to 8 bits and then
filtering has been effective. In general, however, filter type None is best.</p>


<p><a name="INDEX-685" />
<a name="INDEX-686" />
<a name="INDEX-687" />
<a name="INDEX-688" />For grayscale and truecolor images of 8 or more bits per sample, with or
without alpha channels, dynamic filtering is almost always beneficial. The
approach that has by now become standard is known as the <em class="emphasis">minimum sum of
absolute differences</em> heuristic and was first proposed by Lee Daniel Crocker
in February 1995. In this approach, the filtered bytes are treated as signed
values--that is, any value over 127 is treated as negative; 128 becomes
-128 and 255 becomes -1. The absolute value of each is then summed, and
the filter type that produces the smallest sum is chosen. This approach
effectively gives preference to sequences that are close to zero and
therefore is biased against filter type None.</p>


<p><a name="INDEX-689" />A related heuristic--still experimental at the time of this writing--is
to use the <em class="emphasis">weighted</em> sum of absolute differences. The theory, to some
extent based on empirical evidence, is that switching filters too often can
have a deleterious effect on the main compression engine. A better approach
might be to favor the most recently used filter even if its absolute sum of
differences is slightly larger than that of other filters, in order to produce
a more homogeneous data stream for the compressor--in effect, to allow
short-term losses in return for long-term gains. The standard PNG library
contains code to enable this heuristic, but a considerable amount of
experimentation is yet to be done to determine the best combination of
weighting factors, compression levels, and image types.</p>


<p>One can also imagine heuristics involving higher-order distance metrics (e.g.,
root-mean-square sums), sliding averages, and other statistical methods, but
to date there has been little research in this area. Lossless compression is
a necessity for many applications, but cutting-edge research in image
compression tends to focus almost exclusively on lossy methods, since the
payoff there is so much greater. Even within the lossless domain,
preconditioning the data stream is likely to have less effect than changing
the back-end compression algorithm itself. So let's take a look at that
next.
<a name="INDEX-690" /></p>
</div>

















</div>
<div class="sect1"><a name="png.ch09.div.2" />
<h2 class="sect1">9.2. The Deflate Compression Algorithm</h2>


<p><a name="INDEX-691" />
<a name="INDEX-692" />
<a name="INDEX-693" />In some ways compression is responsible for the very existence of the Portable
Network Graphics format (recall <a href="chapter01.html">Chapter 1, "An Introduction to PNG"</a>), and it is undoubtedly one of the
most important components of PNG. The PNG specification defines a single
compression method, the <em class="emphasis">deflate</em> algorithm, for all image types.</p>


<p><a name="INDEX-694" />Part of the LZ77 class of compression algorithms, deflate was defined
by PKWARE in 1991 as part of the 1.93a beta version of their PKZIP
<a name="INDEX-695" />
archiver. Independently implemented by Jean-loup Gailly
<a name="INDEX-696" />
and Mark Adler,
<a name="INDEX-696.01-new" />
first for Info-ZIP's Zip and UnZip utilities and shortly
<a name="INDEX-696.02-new" />
thereafter for the GNU <em class="emphasis">gzip</em> utility, the deflate algorithm is
battle-tested and today is probably the most commonly used
file-compression algorithm on the Internet. Although it is not the
best-compressing algorithm known,<a href="#FOOTNOTE-70">[70]</a>
deflate has a very desirable mix of characteristics: high
reliability, good compression, good encoding speed, excellent decoding speed,
minimal overhead on incompressible data, and modest, well-defined memory
footprints for both encoding and decoding.
</p><blockquote class="footnote">


<a name="INDEX-696.03-new" />	<!-- arithmetic coding -->
<a name="INDEX-696.04-new" />	<!-- bzip2 -->
<a name="FOOTNOTE-70" /><p>[70] Arithmetic coding
has been around for a long time and significantly outperforms deflate; the
relatively recently published Burrows-Wheeler block transform coding
(implemented in <em class="emphasis">bzip2</em>, for example) shows considerable promise as
well; and the patented BitJazz condensation method is likewise quite
impressive.</p>


</blockquote>


<p><a name="INDEX-697" />As an LZ77-derived algorithm, deflate is fundamentally based on the concept
of a <em class="emphasis">sliding window</em>. One begins with the premise that many types of
interesting data, from binary computer instructions to source code to
ordinary text to images, are repetitious to varying degrees. The basic
idea of a sliding window is to imagine a window of some width immediately
preceding the current position in the data stream (and therefore
sliding along as the current position is updated), which one can use as
a kind of dictionary to encode subsequent data. For example, if the text
of this chapter is the data stream, then the current position at the very
instant you read this is <em class="emphasis">here</em>. Preceding that point is a little more
than 13,000 bytes of text, which includes, among other things, six copies of
the fragment ``or example'' (whoa, there's another one!). Instead of encoding
such strings as literal text, deflate replaces each with a pair of numbers
indicating its length (in this case, 10 bytes) and the distance back to one
of the previous instances (perhaps 950 bytes between the fifth and sixth).
The greater the length of the string, the greater the savings in encoding it
as a pointer into the window.
</p>


<p><a name="INDEX-698" />
<a name="INDEX-699" />
<a name="INDEX-700" />There are various ways to implement LZ77; the approach used by deflate is
a ``greedy'' algorithm originally devised by James Storer and Thomas
Szymanski--hence its name, LZSS. LZSS employs a look-ahead buffer and
finds the longest match for the buffer within the sliding window. If the
match exceeds a given threshold length, the string is encoded as a
length/distance pair and the buffer is advanced a corresponding amount. If the
longest match is <em class="emphasis">not</em> sufficiently long, the first character in the
look-ahead buffer is output as a literal value, and the buffer is advanced
by one. Either way, the algorithm continues by seeking the longest match
for the new contents of the buffer.
</p>


<p>The deflate algorithm is actually a bit more clever than the preceding
description would suggest. Rather than simply storing the length/distance
pairs and literal bytes as is, it further compresses the data by
<a name="INDEX-701" />
<a name="INDEX-702" />Huffman-encoding the LZ77 output. This approach is generically
referred to as LZH; deflate's uniqueness lies in its method of combining
literals and lengths into a single Huffman tree, its use of both fixed and
dynamic Huffman codes, and its division of the output stream into blocks
so that regions of incompressible data can be stored as is, rather than
expanding significantly, as can happen with the LZW algorithm.</p>


<p><a name="INDEX-703" />The PNG specification further dictates that the deflate data stream
must conform to the zlib 1.0 format. In particular, the size of the
sliding window must be a power of 2 between 256 bytes and 32
kilobytes, inclusive, and a small zlib header and trailer are
required. The latter includes a 32-bit checksum on the
<em class="emphasis">uncompressed</em> data; recall that the compressed stream is already
covered by PNG's 32-bit CRC value in each IDAT chunk.</p>


<p>More detailed explanation of the deflate algorithm and the zlib data
format is beyond the scope of this book, but the full zlib and deflate
<!--
specifications are included in Appendixes D and E, respectively.
 -->
specifications are available from
<a href="http://www.zlib.org/zlib_docs.html"
>http://www.zlib.org/zlib_docs.html</a>&nbsp;.
In addition, a reference such as
<em class="emphasis">The Data Compression Book</em>, by Mark
Nelson and Jean-loup Gailly, is invaluable for understanding many
compression algorithms, including LZ77 and LZSS.</p>


<a name="INDEX-703.01-new" />
<p>Practically speaking, independent implementation of the deflate algorithm is
both difficult and unnecessary. Almost every PNG implementation available
today makes use of the freely available zlib compression library, and
the examples in Part III, <em class="emphasis">Programming with PNG</em>, do so as well.<a href="#FOOTNOTE-71">[71]</a>
For now I merely note that zlib supports ten compression levels
(including one with no compression at all), differing in the algorithms used
to find matching strings and in the thresholds for terminating the search
prematurely.</p><blockquote class="footnote">


<a name="FOOTNOTE-71" /><p>[71] Nevertheless, at least one alternative (in C++) is available as part of
Colosseum Builders' Image Library, and it is also described in a book by
John Miano, <em class="emphasis">The Programmer's Guide to Compressed Image Files</em>.</p>


</blockquote>


<p>As an aside, note that the efficiency of the compression engine increases
as more data is processed, with peak efficiency being reached when there is
sufficient data to fill the sliding window. This occurs mainly because there
are fewer strings available in the ``dictionary,'' but also, initially, because
those strings that do exist are limited in length--obviously, they cannot be
any longer than the amount of data in the window. Thus, for example, when 25%
of the compressed data has been received, it may correspond to only 20% of the
pixels. But because of data buffering in network protocols and applications,
any large disparities due to the truly low-efficiency encoding at startup will
tend to be washed out at the 512-byte level (or higher). That is, even though
the first 50 bytes might represent only 1% compression, those bytes generally
will not be available until after the 512th byte has been received, by
which point the compression efficiency may have reached 10% or better. And
since this is generally true of most compression algorithms, including those
used by both GIF and PNG, it is reasonable to compare (as I did in <a href="chapter01.html">Chapter 1, "An Introduction to PNG"</a>)
the appearance of the <em class="emphasis">uncompressed</em> pixels at an instant when equal
amounts of <em class="emphasis">compressed</em> data have been received.




</p>


<a name="png.ch09.div.2.1" /><div class="sect2">
<h3 class="sect2">9.2.1. A Final Word on Patents</h3>


<p><a name="INDEX-704" />
<a name="INDEX-705" />As mentioned at the end of 
<a href="chapter07.html">Chapter 7, "History of the Portable Network Graphics Format"</a>, 
Stac has reportedly claimed that the
deflate algorithm is covered by two of their patents. In fact, there
are a number of patents that <em class="emphasis">can</em> be infringed upon by a
compliant deflate implementation, including one held by PKWARE itself
that involves sorted hash tables. But the deflate specification
includes a section on implementing the algorithm without
infringing,<a href="#FOOTNOTE-72">[72]</a>
and, of course, zlib itself follows that prescription. While these
things are never 100% certain unless and until they are tested in court,
developers and users can be reasonably confident that the use of zlib
and its implementation of the deflate algorithm is not subject to licensing
fees.
<a name="INDEX-706" />
<a name="INDEX-707" />
</p><blockquote class="footnote">


<a name="FOOTNOTE-72" /><p>[72] From Section 4 of the deflate specification,
<em class="emphasis">Compression algorithm details</em>:
``...it is strongly recommended that the implementor of a
compressor follow the general algorithm presented here, which is known
not to be patented per se.''</p>


</blockquote>
</div>






</div>
<div class="sect1"><a name="png.ch09.div.3" />
<h2 class="sect1">9.3. Real-World Comparisons</h2>


<p><a name="INDEX-708" />
<a name="INDEX-709" />The only convincing way to demonstrate the compression benefits of one
image format over another is to do an actual comparison of the two on
a set of real images. The problem is choosing the set of
images--what works for one person may not work for another. What
I've done here is to gather together results from a number of
real-world tests performed over the past few years. Readers can
expect to achieve similar results on similar sets of images, but keep
in mind that one can always choose a <em class="emphasis">particular</em> set of images
for which the results will be dramatically different. I'll explain
that remark after we see a few cases.</p>


<p><a name="INDEX-710" />For starters, let's look at a small, very unscientifically chosen set
of images: seven nonanimated GIF images that happened to be the only
ones readily available on my machine one fine day in June 1998.</p>


<a name="png.ch09.tbl.2" />
<div class="table" align="center">

<p>
<table width="502" border="0">
  <tr>
    <td>
      <b class="emphasis-bold">Table 9-2.</b>
      <i>Seven Non-Animated, Non-Scientifically Selected GIF Images</i>
    </td>
  </tr>
</table>
</p>

<p>
<!-- table width="502" border="1" -->
<table border="1">

<tr>
<td align="center"><b class="emphasis-bold">&nbsp;Name&nbsp;</b></td>
<td align="center"><b class="emphasis-bold">&nbsp;GIF Size&nbsp;</b></td>
</tr>

<tr>
<td>&nbsp;linux-penguins&nbsp;</td>
<td align="right">&nbsp;38,280&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux-tinypeng&nbsp;</td>
<td align="right">&nbsp;1,249&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_bigcrash&nbsp;</td>
<td align="right">&nbsp;298,529&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_lgeorges&nbsp;</td>
<td align="right">&nbsp;20,224&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_rasterman&nbsp;</td>
<td align="right">&nbsp;4,584&nbsp;</td>
</tr>

<tr>
<td>&nbsp;sun-tinylogo&nbsp;</td>
<td align="right">&nbsp;1,226&nbsp;</td>
</tr>

<tr>
<td>&nbsp;techweb-scsi-compare&nbsp;</td>
<td align="right">&nbsp;27,660&nbsp;</td>
</tr>

<tr>
<td><b class="emphasis-bold">&nbsp;TOTAL&nbsp;</b></td>
<td align="right"><b class="emphasis-bold">&nbsp;391,752&nbsp;</b></td>
</tr>

</table>
</p>
</div>


<p>The images ranged in size from just over a kilobyte to nearly 300
kilobytes (the exact byte sizes are given in
<a href="#png.ch09.tbl.2">Table 9-2</a>) and
in dimension from 32&nbsp;&times; 32 to 800&nbsp;&times; 600. All but the first
and last were interlaced. When converted to PNG with the gif2png
<a name="INDEX-711" />
<a name="INDEX-712" />utility (<a href="chapter05.html">Chapter 5, "Applications: Image Converters"</a>), preserving
interlacing manually and introducing no new text annotations, things
improved somewhat; <a href="#png.ch09.tbl.3">Table 9-3</a> shows the
preliminary results.</p>


<a name="png.ch09.tbl.3" />
<div class="table" align="center">

<p>
<table width="502" border="0">
  <tr>
    <td>
      <b class="emphasis-bold">Table 9-3.</b>
      <i>Same Seven GIF Images After Conversion to PNG</i>
    </td>
  </tr>
</table>
</p>

<p>
<!-- table width="502" border="1" -->
<table border="1">

<tr>
<td align="center"><b class="emphasis-bold">&nbsp;Name&nbsp;</b></td>
<td align="center"><b class="emphasis-bold">&nbsp;PNG Size&nbsp;</b></td>
<td align="center"><b class="emphasis-bold">&nbsp;Change&nbsp;</b></td>
</tr>

<tr>
<td>&nbsp;linux-penguins&nbsp;</td>
<td align="right">&nbsp;35,224&nbsp;</td>
<td align="right">&nbsp;-8.0%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux-tinypeng&nbsp;</td>
<td align="right">&nbsp;722&nbsp;</td>
<td align="right">&nbsp;-42.2%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_bigcrash&nbsp;</td>
<td align="right">&nbsp;283,839&nbsp;</td>
<td align="right">&nbsp;-4.9%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_lgeorges&nbsp;</td>
<td align="right">&nbsp;20,476&nbsp;</td>
<td align="right">&nbsp;+1.2%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_rasterman&nbsp;</td>
<td align="right">&nbsp;4,812&nbsp;</td>
<td align="right">&nbsp;+5.0%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;sun-tinylogo&nbsp;</td>
<td align="right">&nbsp;566&nbsp;</td>
<td align="right">&nbsp;-53.8%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;techweb-scsi-compare&nbsp;</td>
<td align="right">&nbsp;20,704&nbsp;</td>
<td align="right">&nbsp;-25.1%&nbsp;</td>
</tr>

<tr>
<td><b class="emphasis-bold">&nbsp;TOTAL&nbsp;</b></td>
<td align="right"><b class="emphasis-bold">&nbsp;366,343&nbsp;</b></td>
<td align="right"><b class="emphasis-bold">&nbsp;-6.5%&nbsp;</b></td>
</tr>

</table>
</p>
</div>


<a name="INDEX-712.01-new" />
<p>Five of the images shrank when converted to PNG--three of them quite
significantly--while two grew. Overall, the set achieved a 6.5% improvement
in byte size. Since gif2png uses the standard settings of the
libpng reference code,<a href="#FOOTNOTE-73">[73]</a>
its results may be considered typical of ``good'' PNG encoders. But the owner
of a web site will often be willing to spend a little more time up front on
compression in return for additional bandwidth savings in the long run. That's
<a name="INDEX-713" />
<a name="INDEX-714" />
where pngcrush (also discussed in
<a href="chapter05.html">Chapter 5, "Applications: Image Converters"</a>)
comes in. <a href="#png.ch09.tbl.4">Table 9-4</a> shows its results;
the percentages are again relative to the original GIF file sizes.</p>


<a name="png.ch09.tbl.4" />
<div class="table" align="center">

<p>
<table width="502" border="0">
  <tr>
    <td>
      <b class="emphasis-bold">Table 9-4.</b>
      <i>Same Seven GIF Images After PNG Conversion and Optimization</i>
    </td>
  </tr>
</table>
</p>

<p>
<!-- table width="502" border="1" -->
<table border="1">

<tr>
<td align="center"><b class="emphasis-bold">&nbsp;Name&nbsp;</b></td>
<td align="center"><b class="emphasis-bold">&nbsp;Optimized&nbsp;<br />&nbsp;PNG Size&nbsp;</b></td>
<td align="center"><b class="emphasis-bold">&nbsp;Change&nbsp;</b></td>
</tr>

<tr>
<td>&nbsp;linux-penguins&nbsp;</td>
<td align="right">&nbsp;34,546&nbsp;</td>
<td align="right">&nbsp;-9.8%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux-tinypeng&nbsp;</td>
<td align="right">&nbsp;710&nbsp;</td>
<td align="right">&nbsp;-43.2%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_bigcrash&nbsp;</td>
<td align="right">&nbsp;282,948&nbsp;</td>
<td align="right">&nbsp;-5.2%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_lgeorges&nbsp;</td>
<td align="right">&nbsp;19,898&nbsp;</td>
<td align="right">&nbsp;-1.6%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;linux_rasterman&nbsp;</td>
<td align="right">&nbsp;4,731&nbsp;</td>
<td align="right">&nbsp;+3.2%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;sun-tinylogo&nbsp;</td>
<td align="right">&nbsp;550&nbsp;</td>
<td align="right">&nbsp;-55.1%&nbsp;</td>
</tr>

<tr>
<td>&nbsp;techweb-scsi-compare&nbsp;</td>
<td align="right">&nbsp;19,155&nbsp;</td>
<td align="right">&nbsp;-30.7%&nbsp;</td>
</tr>

<tr>
<td><b class="emphasis-bold">&nbsp;TOTAL&nbsp;</b></td>
<td align="right"><b class="emphasis-bold">&nbsp;362,538&nbsp;</b></td>
<td align="right"><b class="emphasis-bold">&nbsp;-7.5%&nbsp;</b></td>
</tr>

</table>
</p>
</div>

<blockquote class="footnote">
<a name="FOOTNOTE-73" />
<p>[73] libpng is discussed at length in
<a href="chapter13.html">Chapter 13, "Reading PNG Images"</a>,
<a href="chapter14.html">Chapter 14, "Reading PNG Images Progressively"</a> and
<a href="chapter15.html">Chapter 15, "Writing PNG Images"</a>, which
demonstrate how to use libpng to read and write PNG images.</p>
</blockquote>


<p>So we see that the current state-of-the-art PNG encoder ekes out
another percentage point in the total size, with all but one of the
images now smaller than the original. That lone holdout is worth a
closer look in this case. I already noted that
<em class="emphasis">linux_rasterman.gif</em> was one of the interlaced images; suppose it
had not been? The noninterlaced GIF version is 4,568 bytes, only 16
bytes smaller than the original. But the noninterlaced PNG version is
either 4,067 bytes (gif2png) or 4,000 bytes (pngcrush)--a savings of
11.0% or 12.4% over the noninterlaced GIF. In other words, PNG's
two-dimensional interlacing scheme can have a <em class="emphasis">significant negative
impact on compression</em>, particularly for small images. This is an
important point to consider when creating images: is interlacing
really needed for a 152&nbsp;&times; 96 image (as in this case) when the
penalty is more than 18% of the file size?</p>


<p>This example may have been instructive, but seven images do not constitute a
statistically valid sample.<a href="#FOOTNOTE-74">[74]</a>
<a name="INDEX-715" />
<a name="INDEX-716" />
<a name="INDEX-717" />
So let's consider a few more data sets. One real-life example comes from
the course entitled ``Authoring Compelling and Efficient VRML 2.0 Worlds''
at the VRML98 conference in Monterey, California. Though the content of the
course was otherwise outstanding, one slide comparing image formats for 3D
textures was rather alarming from a PNG perspective. It showed the information
displayed in <a href="#png.ch09.tbl.5">Table 9-5</a>,
together with the textures themselves (which are omitted here):</p>


<a name="png.ch09.tbl.5" />
<div class="table" align="center">

<p>
<table width="502" border="0">
  <tr>
    <td>
      <b class="emphasis-bold">Table 9-5.</b>
      <i>Original PNG, GIF, and JPEG Comparison from VRML98 Course</i>
    </td>
  </tr>
</table>
</p>

<p>
<table width="502" border="1">

<tr>
<td align="middle"><b class="emphasis-bold">Name</b></td>
<td align="middle"><b class="emphasis-bold">Dimensions</b></td>
<td align="middle"><b class="emphasis-bold">Type</b></td>
<td align="middle"><b class="emphasis-bold">JPEG Size</b></td>
<td align="middle"><b class="emphasis-bold">GIF Size</b></td>
<td align="middle"><b class="emphasis-bold">PNG Size</b></td>
</tr>

<tr>
<td>&nbsp;linoleum1&nbsp;</td>
<td align="center">128&nbsp;&times; 128</td>
<td align="center">grayscale</td>
<td align="right">10,956</td>
<td align="right">7,055</td>
<td align="right">16,008</td>
</tr>

<tr>
<td>&nbsp;doggie&nbsp;</td>
<td align="center">128&nbsp;&times; 256</td>
<td align="center">color</td>
<td align="right">9,897</td>
<td align="right">24,605</td>
<td align="right">89,022</td>
</tr>

<tr>
<td>&nbsp;fog&nbsp;</td>
<td align="center">128&nbsp;&times; 128</td>
<td align="center">grayscale + alpha</td>
<td align="right">--</td>
<td align="right">--</td>
<td align="right">26,732</td>
</tr>

<tr>
<td>&nbsp;circlefade&nbsp;</td>
<td align="center">128&nbsp;&times; 128</td>
<td align="center">grayscale + alpha</td>
<td align="right">--</td>
<td align="right">--</td>
<td align="right">15,735</td>
</tr>

<tr>
<td>&nbsp;buttfly&nbsp;</td>
<td align="center">128&nbsp;&times; 128</td>
<td align="center">color + transparency</td>
<td align="right">--</td>
<td align="right">4,367</td>
<td align="right">--</td>
</tr>

<?x-space 5p?>
</table>
</p>
</div>

<blockquote class="footnote">
<a name="FOOTNOTE-74" /><p>[74] That would be a small understatement.</p>
</blockquote>


<p>Even with no more details than are shown here, at least one problem is
apparent: in row 2, the JPEG image is 24 bits deep, while the GIF is only
8 bits. Judging by the size of the corresponding PNG, one might assume
(correctly) that the PNG is also 24 bits. So on the one hand, PNG is being
compared with an image of the same depth that uses lossy compression, while
on the other it is being compared with an image only one-third as deep, which
also amounts to lossy compression. That hurts.</p>


<p>As it turned out, there were other problems: the PNG images were created
with an image editor not known for its compression capabilities, and some
of the PNGs were interlaced even though their GIF counterparts were not.
(And since this was a VRML course, I should note that no VRML browser in
existence actually uses interlacing to render textures progressively, so
there is generally no point in creating such images.) The upshot is that
all of these factors--JPEG's lossy compression, mixing 24-bit and 8-bit
images, mixing interlaced and noninterlaced images, and using a particularly
poor encoder to compress the PNGs--worked against our favorite image format.</p>


<p>After evening the playing field by using the GIFs as the source images for
the PNGs, turning off interlacing, and using a combination of conversion and
encoding tools (including pngcrush), the results were considerably
better for PNG, as shown in 
<a href="#png.ch09.tbl.6">Table 9-6</a>. 


</p>


<a name="png.ch09.tbl.6" />
<div class="table" align="center">

<p>
<table width="502" border="0">
  <tr>
    <td>
      <b class="emphasis-bold">Table 9-6.</b>
      <i>Updated PNG, GIF, and JPEG Comparison for VRML98 Course Images</i>
    </td>
  </tr>
</table>
</p>

<p>
<table width="502" border="1">

<tr>
<td align="middle"><b class="emphasis-bold">Name</b></td>
<td align="middle"><b class="emphasis-bold">JPEG Size</b></td>
<td align="middle"><b class="emphasis-bold">GIF Size</b></td>
<td align="middle"><b class="emphasis-bold">Original<br />PNG Size</b></td>
<td align="middle"><b class="emphasis-bold">Optimized<br />PNG Size</b></td>
<td align="middle"><b class="emphasis-bold">PNG<br />Change</b></td>
</tr>

<tr>
<td>linoleum1</td>
<td align="right">10,956</td>
<td align="right">7,055</td>
<td align="right">16,008</td>
<td align="right"><b class="emphasis-bold">6,753</b></td>
<td align="right">-57.8%</td>
</tr>

<tr>
<td>doggie</td>
<td align="right"><b class="emphasis-bold">9,897</b></td>
<td align="right">24,605</td>
<td align="right">89,022</td>
<td align="right">22,555</td>
<td align="right">-74.7%</td>
</tr>

<tr>
<td>fog</td>
<td align="right">--</td>
<td align="right">--</td>
<td align="right">26,732</td>
<td align="right"><b class="emphasis-bold">16,221</b></td>
<td align="right">-39.3%</td>
</tr>

<tr>
<td>circlefade</td>
<td align="right">--</td>
<td align="right">--</td>
<td align="right">15,735</td>
<td align="right"><b class="emphasis-bold">6,638</b></td>
<td align="right">-57.8%</td>
</tr>

<tr>
<td>buttfly</td>
<td align="right">--</td>
<td align="right">4,367</td>
<td align="right">--</td>
<td align="right"><b class="emphasis-bold">3,965</b></td>
<td align="right">--</td>
</tr>

<?x-space 5p?>
</table>
</p>
</div>


<p><a name="INDEX-718" />Here, I've marked the smallest version of each image
in boldface type; the only one that isn't a PNG is the color JPEG, which is
hardly surprising.  What is interesting is that the grayscale JPEG
(<em class="emphasis">linoleum1.jpg</em>) is
larger than both the GIF and optimized PNG versions, despite the presumed
benefits of lossy compression. There are at least three reasons for this.
First, GIF and PNG both get an automatic factor-of-three savings from the fact
that each pixel is only 1 byte deep instead of 3 bytes. Second, JPEG
is at a relative disadvantage when dealing with grayscale images, because most
of its compression benefits arise from how it treats the color components
of an image. Third, this particular image is more artificial than natural,
with quite a few relatively sharp features, which makes it particularly
ill suited to JPEG-style compression.</p>


<p>But perhaps the most striking feature of 
<a href="#png.ch09.tbl.6">Table 9-6</a> is just how poorly the original
encoder did on its PNG
images. Realizable savings of 40% to 75% are unusual, but thanks to
poor encoding software, they are not as unusual as one might hope.</p>


<p>As another real example (but one that is perhaps more representative
of what a typical web site might expect), the owner of
<a href="http://www.feynman.com/">http://www.feynman.com/</a> found that
when he converted 54
nonanimated GIFs to PNGs, the collection grew in size from 270,431
bytes to 327,590 bytes. Insofar as all of the original images had
depths of 8 bits or less--and even the worst PNG encoder will, on
average, do as well or better than GIF on colormapped PNG
images--the most likely explanation for the 21% increase in size is
that the conversion utility produced 24-bit RGB PNGs. Indeed, the
owner indicated that he had used ImageMagick's convert utility, older
versions of which reportedly had the unfortunate habit of creating
24-bit PNGs unless explicitly given the <b class="emphasis-bold">-depth 8</b>
option. (This problem seems to have been fixed in more recent versions,
but even current versions include 160 bytes' worth of text and background
chunks per image.) When the original GIFs were converted to PNG with gif2png
instead, the total size dropped to 215,668 bytes, for a 20% overall
savings over GIF. Individually, the GIFs were smaller in 15 of the 54
cases, but never by more than 340 bytes. Of the 39 images in which
the PNG version was smaller, one-third of them differed by more than a
kilobyte, and one was 14&nbsp;KB smaller.</p>


<p><a name="INDEX-719" />For the last GIF comparison, I downloaded the World Wide Web
Consortium's icon collection, consisting of 448 noncorrupted GIF
images. Of these, 43 had embedded text comments and 39 were
interlaced. Most of the images were icon-sized (as would be
expected), at 64&nbsp;&times; 64 or smaller, but there were a handful of
larger images, too. The total size of the files was 1,810,239 bytes.
Conversion to PNG via <em class="emphasis">gif2png</em>, handling the interlaced and
noninterlaced images separately in order to preserve their status,
resulted in a total PNG size of 1,587,337 bytes, or a 12.3% reduction.
Additional compression via pngcrush resulted in a total of 1,554,965
bytes, or a 14.1% reduction (relative to the GIF size). Out of the
448 images, PNG won the size comparison in 285 cases, lost in 161
cases, and tied in 2 cases. As in the previous test, however, the
magnitude of the differences was the critical factor: GIF won by more
than a kilobyte in only 1 case, while PNG won by that amount in 37
cases--4 of which were greater than 10&nbsp;KB, 1 more than 64&nbsp;KB. The
average difference for the 285 cases in which PNG was smaller was 940
bytes; for the 161 GIF cases, it was a mere 78 bytes.
</p>


<p><a name="INDEX-720" />
<a name="INDEX-721" />Finally, I've mentioned an upcoming JPEG standard for lossless compression
a couple of times; it's worth a quick look, too. JPEG-LS, as the standard
will be known,<a href="#FOOTNOTE-75">[75]</a>
is based on Hewlett-Packard's LOCO-I algorithm. As this is written, it
is implemented in version 0.90 of HP's locoe encoder, available only
in binary form for the HP-UX, Solaris, IRIX, and 32-bit Windows
platforms. (An independent implementation is available as C source
code from the University of British Columbia.) In a comparison
<a name="INDEX-722" />performed by Adam Costello, the HP encoder was tested against pnmtopng
and pngcrush on the eight standard color images in the Waterloo
<?x-need 10?>BragZone's ColorSet.  pnmtopng is of interest only for speed reasons;
even though it is moderately fast, locoe was considerably faster. I
have omitted its size results from the comparison since, as expected,
pngcrush outperformed it in all cases, though at a considerable cost
in speed.
</p><blockquote class="footnote">


<a name="FOOTNOTE-75" /><p>[75] In December 1998 it became an ISO Draft International Standard, the final
voting stage before becoming a full International Standard. It will
officially be known as ISO/IEC 14495-1 upon approval. It has already been
approved as International Telecommunication Union (ITU) Recommendation T.87.</p>


</blockquote>


<p>The results were fascinating. In the five test images categorized by the
University of Waterloo as ``natural,'' JPEG-LS beat PNG by between 5% and
11%--not a huge difference, but certainly significant. However, in the
three images marked ``artistic,'' PNG proved superior by wide margins, with
one image more than three times smaller than the corresponding JPEG-LS version.
These results are summarized in 
<a href="#png.ch09.tbl.7">Table 9-7</a>; once again, the byte size of the
winning format for each image is highlighted in boldface type.
</p>


<a name="png.ch09.tbl.7" />
<div class="table" align="center">

<p>
<table width="502" border="0">
  <tr>
    <td>
      <b class="emphasis-bold">Table 9-7.</b>
      <i>PNG and JPEG-LS Comparison for Waterloo BragZone Color Images</i>
    </td>
  </tr>
</table>
</p>

<p>
<table width="502" border="1">

<tr>
<td align="center"><b class="emphasis-bold">Classification</b></td>
<td align="center"><b class="emphasis-bold">Name</b></td>
<td align="center"><b class="emphasis-bold">Total<br />Pixels</b></td>
<td align="center"><b class="emphasis-bold">JPEG-LS<br />Size</b></td>
<td align="center"><b class="emphasis-bold">PNG<br />IDAT Size</b></td>
<td align="center"><b class="emphasis-bold">Relative<br />Difference</b></td>
</tr>

<tr>
<td align="center" rowspan=5>``natural''</td>
<td>lena</td>
<td align="right">262,144</td>
<td align="right"><b class="emphasis-bold">445,799</b></td>
<td align="right">475,430</td>
<td align="right">+6.6%</td>
</tr>

<tr>
<td>monarch</td>
<td align="right">393,216</td>
<td align="right"><b class="emphasis-bold">555,012</b></td>
<td align="right">615,260</td>
<td align="right">+10.9%</td>
</tr>

<tr>
<td>peppers</td>
<td align="right">262,144</td>
<td align="right"><b class="emphasis-bold">385,047</b></td>
<td align="right">425,560</td>
<td align="right">+10.5%</td>
</tr>

<tr>
<td>sail</td>
<td align="right">393,216</td>
<td align="right"><b class="emphasis-bold">767,374</b></td>
<td align="right">808,606</td>
<td align="right">+5.4%</td>
</tr>

<tr>
<td>tulips</td>
<td align="right">393,216</td>
<td align="right"><b class="emphasis-bold">616,536</b></td>
<td align="right">680,881</td>
<td align="right">+10.4%</td>
</tr>

<?x-space 2p?><?x-space 2p?><tr>
<td align="center" rowspan=3>``artistic''</td>
<td>clegg</td>
<td align="right">716,320</td>
<td align="right">653,299</td>
<td align="right"><b class="emphasis-bold">484,589</b></td>
<td align="right">-25.8%</td>
</tr>

<tr>
<td>frymire</td>
<td align="right">1,235,390</td>
<td align="right">935,285</td>
<td align="right"><b class="emphasis-bold">251,865</b></td>
<td align="right">-73.1%</td>
</tr>

<tr>
<td>serrano</td>
<td align="right">499,426</td>
<td align="right">293,532</td>
<td align="right"><b class="emphasis-bold">106,765</b></td>
<td align="right">-63.6%</td>
</tr>

<?x-space 5p?>
</table>
</p>
</div>




<p>Note that in the final column I used the JPEG-LS size as the
reference, which effectively works against PNG--had I used PNG
instead, the <em class="emphasis">frymire</em> image, for example, would show JPEG-LS as
271.3% larger, which looks much more impressive! Also note that I
used the size of the PNG IDAT data for comparison rather than the
actual PNG file size; this was done because locoe appears to encode
raw JPEG data, with none of the overhead of standard JPEG file formats
like JFIF and SPIFF.</p>


<p>In any case, the results are only slightly more statistically valid than
the first comparison of GIF images was. Eight samples, even if they are a
carefully chosen set of standard research images, cannot tell the full
story. And results as intriguing as these certainly deserve more extensive
testing, which will no doubt happen in due course.










<a name="INDEX-723" />
<a name="INDEX-724" />
</p>


















</div>
<div class="sect1"><a name="png.ch09.div.4" />
<h2 class="sect1">9.4. Practical Compression Tips</h2>


<p><a name="INDEX-725" />
<a name="INDEX-726" />I could hardly end this chapter without some practical pointers on optimizing
PNG compression, both for users and for programmers. Herewith are some
rough guidelines, arranged in descending order of effectiveness. Note that,
as with any set of rules, there will always be exceptions.</p>


<a name="png.ch09.div.4.1" /><div class="sect2">
<h3 class="sect2">9.4.1. Tips for Users</h3>


<p>Following is a list of tips for users of PNG-supporting software:</p>


<dl>
<dt><b><em class="emphasis">Use the correct image format</em></b></dt><dd><p><a name="INDEX-727" />
<a name="INDEX-728" />
<a name="INDEX-729" />If you have photographic images and their quality as JPEGs is
acceptable, use JPEG! JPEG will almost always be smaller than PNG,
especially for color images. Conversely, if you have images with just
a few colors and/or sharp edges (such as text and simple graphics),
JPEG is almost never the correct solution; use PNG or GIF instead.
For binary transparency, also use PNG or GIF; for partial transparency
or lossless RGB, use PNG or TIFF; for animations, use MNG or GIF.</p></dd>



<dt><b><em class="emphasis">Use the correct pixel depth</em></b></dt><dd><p><a name="INDEX-730" />
<a name="INDEX-731" />
<a name="INDEX-732" />For example, don't convert a GIF (which, from a practical perspective,
always has a depth of 8 bits or less) to a 24-bit PNG; that will
automatically boost the file size by a factor of three. Similarly, if
given the option, don't save a grayscale image as RGB; save it as
grayscale or, at worst, as a palette-based PNG. Likewise, don't use a
full alpha channel if single-color transparency (&agrave; la GIF) would
suffice; it doubles the size of grayscale images and adds 33% to the
size of RGB.</p></dd>



<dt><b><em class="emphasis">Corollary: Quantize and dither truecolor images to a palette if quality is acceptable</em></b></dt><dd><p><a name="INDEX-733" />
<a name="INDEX-734" />
<a name="INDEX-735" />
<a name="INDEX-736" />
<a name="INDEX-737" />Likewise, quantize and dither RGBA or gray+alpha
PNGs to a palette, if possible. This is something that only you, the user, can
judge; no reasonable image application will ever quantize (which is a lossy
transformation) unless instructed to do so by you. This is not an issue for
GIF, which realistically supports <em class="emphasis">only</em> colormapped images (i.e., your
choice of GIF as an output format amounts to an explicit instruction to
quantize) nor is it an issue for JPEG, which supports only grayscale and
truecolor. Only PNG supports colormapped, grayscale, and truecolor images, as
well as alpha channels.</p></dd>



<dt><b><em class="emphasis">Use interlacing with care</em></b></dt><dd><p><a name="INDEX-738" />Interlacing is a way to transmit the useful
parts of an image more quickly, particularly on the Web, so that the end user 
can click on a hot-linked region before the image is fully downloaded, if she
so chooses. But as I saw earlier, PNG's two-dimensional interlacing scheme can
degrade compression by 15% in some cases, especially for small images. Since
small images are transmitted over the network fairly quickly anyway, they
usually do not need to be interlaced.</p></dd>



<dt><b><em class="emphasis">Use the correct tools</em></b></dt><dd><p>In the first six chapters, I discussed a number of PNG-supporting
applications and noted their limitations wherever possible; use that
as a guide when choosing your tools, assuming you have a choice. Even
if your program generally compresses PNG images well, consider using
an optimizer such as pngcrush on everything when you're done;<a href="#FOOTNOTE-76">[76]</a>
definitely do so if your program is not known for its compression
performance. For converting GIFs to PNGs, the dedicated gif2png is
the most capable solution, even given its permanently beta version
number; it preserves both transparency and embedded text comments.</p><blockquote class="footnote">


<a name="FOOTNOTE-76" /><p>[76] It is one of my favorite tools, in case that wasn't already
apparent. As of April 1999, there are still a few optimization tricks
it doesn't do, but its author is addressing those even as this is
written.</p>


</blockquote></dd>



<dt><b><em class="emphasis">Don't include unnecessary information</em></b></dt><dd><p>A lengthy copyright message or
other text can add 100 bytes or more, which is a lot for icons and other small
images.</p></dd>

</dl>
</div>








<a name="png.ch09.div.4.2" /><div class="sect2">
<h3 class="sect2">9.4.2. Tips for Programmers</h3>


<p>Following is a list of tips for programmers:</p>


<dl>
<dt><b><em class="emphasis">Use the correct pixel depth</em></b></dt><dd><p><a name="INDEX-739" />
<a name="INDEX-740" />
<a name="INDEX-741" />Count colors! Or at least do so
when the compression setting is ``best'' and you don't know that the image is
grayscale--it doesn't take that long, and computers are good
at that sort of thing. If there are 256 or fewer colors, write a colormapped
image; doing so will translate to a factor-of-three savings in the PNG file
size relative to an RGB image.</p></dd>



<dt><b><em class="emphasis">Use the correct pixel depth II</em></b></dt><dd><p>If the image is colormapped, don't
assume that the pixels must be 8 bits deep. If there are only one or two
colors, write a 1-bit image. If there are three or four colors, write a
2-bit image. If there are between 5 and 16 colors, write a 4-bit image.
(These are the only useful cases for PNG.) The compression engine cannot
compensate for bloated pixels! Choosing the correct depth for a palette-based
image will reduce the file size by a factor of anywhere from two to eight
relative to an 8-bit image.</p></dd>



<dt><b><em class="emphasis">Use grayscale if possible</em></b></dt><dd><p><a name="INDEX-742" />If you know the image is gray, see if it can be written more compactly
as a grayscale PNG than as a colormapped PNG--this is automatically
true if there are more than 16 shades of gray. Doing so will save up
to 780 bytes by eliminating the palette. But don't assume that 16 or
fewer shades automatically means the image can be written as 4-bit (or
smaller) grayscale. Grayscale necessarily implies that the shades are
evenly distributed from black to white. If, for example, the 16 shades
are bunched up in one part of the gray spectrum, the image must be
written as 8-bit grayscale or 4-bit palette-based. For larger images,
the palette-based approach is almost certainly better; for small ones
it depends, but the 8-bit grayscale case may end up being smaller. Try
both, if possible; it's very fast for small images.</p></dd>



<a name="INDEX-742.01-new" />
<dt><b><em class="emphasis">Set the compression and filtering options intelligently</em></b></dt><dd><p>For programs that use libpng (discussed at length in <a href="part3.html">Part III, "Programming with PNG"</a>), this
is not a serious issue; it will automatically do the right thing if
left to itself. But if you are writing custom PNG code, follow the
guidelines in the PNG specification for matching filter strategies
with image types. In particular, use filter type None for colormapped
images and for grayscale images less than 8 bits deep. Use adaptive
filtering (the ``minimum sum of absolute differences'' heuristic) for
all other cases.</p></dd>



<dt><b><em class="emphasis">Truncate the palette</em></b></dt><dd><p>Unlike GIF, PNG's palette size is determined by
the chunk size, so there is no need to include 256 entries if only 173 are
used in the image. At 3 bytes per entry, wasted slots can make a big
difference in icons and other small images.</p></dd>



<dt><b><em class="emphasis">Truncate the transparency chunk</em></b></dt><dd><p><a name="INDEX-743" />
<a name="INDEX-744" />It is extremely rare for every palette entry to be partially or fully
transparent. If there are any opaque entries--in particular, if all
but one are opaque--reorder the palette so that the opaque entries
are at the end. The transparency entries corresponding to these opaque
colors can then be omitted. The absolute worst possible approach is to
put the single transparent entry at the <em class="emphasis">end</em> of the palette!
Those 255 extra bytes are a lot for a file that would otherwise be 500
(or even 150) bytes long.</p></dd>



<dt><b><em class="emphasis">Do transparency intelligently</em></b></dt><dd><p>Understand how PNG's alpha channels and tRNS chunk work. If the alpha
mask is binary (that is, either fully transparent or fully opaque),
see if the transparent parts correspond to a single color or gray
shade; if so, eliminate the alpha channel from the PNG file and use
<a name="INDEX-745" />the tRNS chunk (``cheap transparency'') instead. Alternatively, see
if the total number of color+alpha combinations is 256 or fewer; if
so, write a colormapped image with a tRNS chunk. If the user requests
that an RGBA image be converted to indexed color, do so intelligently.
The combination of PNG's PLTE and tRNS chunks amounts to a palette
whose entries are RGBA values. The exact same algorithms that
quantize and dither a 24-bit RGB image down to an 8-bit palette-based
image can be used to quantize and dither a 32-bit RGBA or 16-bit
grayscale+alpha image down to an 8-bit RGBA palette. In particular,
you cannot treat color values and transparency values as if they are
separate, unrelated entities; attempting to partition the palette into
a ``color part'' and a ``transparent part'' makes no more sense than
attempting to partition a standard RGB palette into red, green, and
blue parts. If you do cheap transparency poorly, the user will be
forced to use a full alpha channel, quadrupling her file size. For
grayscale, an alpha channel ``merely'' doubles the size. Note that
the icicle image in <a href="fig_C1.html">Figure C-1</a> in the color insert is
actually colormapped. Aside from the garish background--which was actually
generated by the viewing application--the full-resolution half looks pretty
darned good, doesn't it?</p></dd>



<dt><b><em class="emphasis">Don't include unnecessary chunks in small images</em></b></dt><dd><p>Gamma information (or
<a name="INDEX-746" />the sRGB chunk) is always good, but a full ICC profile may quadruple the size
of a small image file. Consider not including a Software text chunk or tIME
chunk, or do so only for images larger than, say, 100&nbsp;&times; 100 pixels.
Include dots-per-inch information (pHYs chunk) only if it is actually relevant
to the image; but the user may be the only one who can make that call.</p></dd>



<dt><b><em class="emphasis">Offer the user reasonable options</em></b></dt><dd><p>Don't overwhelm him with unnecessary detail about filters or other
technical jargon. For example, offer a simple checkbox to turn on
interlacing. Offer a simple dial or even just two or three choices
for compression level--<em class="emphasis">fastest</em>, <em class="emphasis">typical</em>, and <em class="emphasis">best,</em>
perhaps. Even though it will make the file bigger, offer to include
at least a few text annotations--Author, Title, Description, and/or
Copyright, for example. On the other hand, offer to omit certain
optional information, such as that described in the previous item.</p></dd>



<dt><b><em class="emphasis">Warn the user about data loss</em></b></dt><dd><p><a name="INDEX-747" />If a region is completely transparent,
don't zero out the underlying color pixels in order to improve compression
unless you've notified the user in some way. Make sure she understands that
quantization and dithering are lossy transformations, but don't make this an
overly scary issue.
<a name="INDEX-748" />
<a name="INDEX-749" />
<a name="INDEX-750" />
</p></dd>

</dl>
</div>

<pre>










































</pre>


<hr> <!-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->

<a href="chapter08.html"><img width=24 height=13 border=0 align="left"
 src="images/prev.png" alt="&lt;-"></a>

<a href="chapter10.html"><img width=24 height=13 border=0 align="right"
 src="images/next.png" alt="-&gt;"></a>

<div align="center">
  <a href="chapter08.html"><font size="-1" color="#000000"
   ><b>PREVIOUS</b></font></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a
     href="toc.html"><font size="-1" color="#000000"
   ><b>CONTENTS</b></font></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a
     href="chapter10.html"><font size="-1" color="#000000"
   ><b>NEXT</b></font></a>
</div>

<hr> <!-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->



</body></html>