<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<TITLE>PNG Basics (PNG: The Definitive Guide)</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<!-- http://www.w3.org/TR/REC-CSS2/box.html -->
<STYLE TYPE="text/css">
P { margin-bottom: 0em }
UL {
margin-bottom: 0em;
margin-top: 0em;
list-style: disc;
}
LI {
padding: 0px 0px 0px 0px;
margin: 0px 0px 0px 0px;
}
</STYLE>
<LINK REV="made" HREF="http://pobox.com/~newt/greg_contact.html">
<!-- Copyright (c) 1999 O'Reilly and Associates. -->
<!-- Copyright (c) 2002-2006 Greg Roelofs. -->
</HEAD>
<body bgcolor="#ffffff" text="#000000">
<hr> <!-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->
<a href="chapter07.html"><img width=24 height=13 border=0 align="left"
src="images/prev.png" alt="<-"></a>
<a href="chapter09.html"><img width=24 height=13 border=0 align="right"
src="images/next.png" alt="->"></a>
<div align="center">
<a href="chapter07.html"><font size="-1" color="#000000"
><b>PREVIOUS</b></font></a> <a
href="toc.html"><font size="-1" color="#000000"
><b>CONTENTS</b></font></a> <a
href="chapter09.html"><font size="-1" color="#000000"
><b>NEXT</b></font></a>
</div>
<hr> <!-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->
<h1 class="chapter">Chapter 8. PNG Basics</h1>
<div class="htmltoc"><h4 class="tochead">Contents:</h4><p>
<a href="#png.ch08.div.1">8.1. Chunks</a><br />
<a href="#png.ch08.div.2">8.2. PNG Signature</a><br />
<a href="#png.ch08.div.3">8.3. A Word on Color Representation</a><br />
<a href="#png.ch08.div.4">8.4. The Simplest PNG</a><br />
<a href="#png.ch08.div.5">8.5. PNG Image Types</a><br />
<a href="#png.ch08.div.5.1">8.5.1. Palette-Based</a><br />
<a href="#png.ch08.div.5.2">8.5.2. Palette-Based with Transparency</a><br />
<a href="#png.ch08.div.5.3">8.5.3. Grayscale</a><br />
<a href="#png.ch08.div.5.4">8.5.4. Grayscale with Transparency</a><br />
<a href="#png.ch08.div.5.5">8.5.5. Grayscale with Alpha Channel</a><br />
<a href="#png.ch08.div.5.6">8.5.6. RGB</a><br />
<a href="#png.ch08.div.5.7">8.5.7. RGB with Transparency</a><br />
<a href="#png.ch08.div.5.8">8.5.8. RGB with Alpha Channel</a><br />
<a href="#png.ch08.div.6">8.6. Interlacing and Progressive Display</a><br />
</p></div>
<p>The fundamental building block of PNG images is the <em class="emphasis">chunk</em>. With
the exception of the first 8 bytes in the file (and we'll come back to
those shortly), a PNG image consists of nothing but chunks.
<a name="INDEX-567" /></p>
<div class="sect1"><a name="png.ch08.div.1" />
<h2 class="sect1">8.1. Chunks</h2>
<p><a name="INDEX-568" />
<a name="INDEX-569" />Chunks were designed to be easily tested and manipulated by computer
programs, easily detected by human eyes, and reasonably self-contained.
Every chunk has the same structure: a 4-byte length (in ``big-endian''
format, as with all integer values in PNG streams), a 4-byte chunk
type, between 0 and 2,147,483,647 bytes of chunk data, and a 4-byte
<em class="emphasis">cyclic redundancy check</em> value (CRC). This is diagrammed in
<a href="#png.ch08.fig.1">Figure 8-1</a>.
</p>
<a name="png.ch08.fig.1" />
<div class="figure" align="center">
<p>
<table width=502>
<tr>
<td>
<img width=502 height=124 border=0
src="figs/png.0801.png" alt="Figure 8-1" /><br>
<br>
<b>Figure 8-1:</b> <i>PNG chunk structure.</i>
</td>
</tr>
</table>
</p>
</div>
<p>The data field is straightforward; that's where the interesting bits (if
any) go; specific content will be discussed later, as each chunk is described.
<a name="INDEX-570" />
<a name="INDEX-571" />The length field refers to the length of the
data field alone, not the chunk type or CRC. The CRC, on the other hand,
covers both the chunk-type field and the chunk data and is always present,
even when there is no chunk data. Note that the combination of length
fields and CRC values is already sufficient to check the basic integrity
of a PNG file! The only missing information--not including the contents
of the first 8 bytes in the file--is the exact algorithm (or
``polynomial'') used for the CRC. That turns out to be identical to the
CRC used by gzip and many popular archiving programs; it is described in
detail in Section 3.4 of the <em class="emphasis">PNG Specification, Version 1.1</em>,
available from <a href="http://www.libpng.org/pub/png/pngdocs.html">http://www.libpng.org/pub/png/pngdocs.html</a>.</p>
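<p>The chunk layout described above can be sketched in a few lines of Python.
This is only an illustration, not code from the specification: the function
names <code>make_chunk</code> and <code>read_chunk</code> are hypothetical,
and the sketch relies on the fact that <code>zlib.crc32</code> in the Python
standard library computes the same CRC used by gzip.</p>

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Serialize one chunk: 4-byte length, 4-byte type, data, 4-byte CRC."""
    if len(chunk_type) != 4:
        raise ValueError("chunk type must be exactly 4 bytes")
    # The CRC covers the chunk type and the data, but not the length field.
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    # ">I" is a big-endian unsigned 32-bit integer.
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def read_chunk(buf: bytes, offset: int = 0):
    """Parse one chunk at `offset`; return (type, data, offset of next chunk)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    chunk_type = buf[offset + 4 : offset + 8]
    data = buf[offset + 8 : offset + 8 + length]
    (stored_crc,) = struct.unpack_from(">I", buf, offset + 8 + length)
    if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != stored_crc:
        raise ValueError("CRC mismatch in chunk %r" % chunk_type)
    # 12 bytes of overhead: length (4) + type (4) + CRC (4).
    return chunk_type, data, offset + 12 + length


if __name__ == "__main__":
    # A chunk with no data still carries a CRC over its type field.
    print(make_chunk(b"IEND").hex())  # 0000000049454e44ae426082
```

<p>Walking a file is then a matter of calling <code>read_chunk</code>
repeatedly, starting at offset 8 (just past the signature), until the
returned offset reaches the end of the data.</p>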
<p>The chunk type is possibly the most unusual feature. It is specified
as a sequence of binary values, which just happen to correspond to
the upper- and lowercase ASCII letters used on virtually every computer in
the Western, non-mainframe world. Since it is far more convenient (and
readable) to speak in terms of text characters than numerical sequences, the
remainder of this book will adopt the convention of referring to chunks by
their ASCII names. Programmers of EBCDIC-based computers should take note of
this and remember to use only the numerical values corresponding to the ASCII
characters.</p>
<p>Chunk types (or names) are usually mnemonic, as in the case of the IHDR
or <em class="emphasis">image header</em> chunk. In addition, however, each character in the
name encodes a single bit of information that shows up in the capitalization
of the character.<a href="#FOOTNOTE-56">[56]</a>
Thus IHDR and iHDR are two completely different
chunk types, and a decoder that encounters an unrecognized chunk can
nevertheless infer useful things about it. From left to right, the four
extra bits are interpreted as follows:</p><blockquote class="footnote">
<a name="FOOTNOTE-56" /><p>[56] The ASCII character set was conveniently designed so that the case of a
letter is always determined by bit 5. To put it another way, adding 32 to
an uppercase character code gives you the code for its lowercase version.</p>
</blockquote>
<ul><li>
<a name="INDEX-572" />
<a name="INDEX-573" />
<p>The first character's case bit indicates whether the chunk is critical
(uppercase) or ancillary; a decoder that doesn't recognize the chunk type
can ignore it if it is ancillary, but it must warn the user that it cannot
correctly display the image if it encounters an unknown critical chunk.
The tEXt chunk, covered in <a href="chapter11.html">Chapter 11, "PNG Options and Extensions"</a>, is
an example of an ancillary chunk.</p></li><li>
<a name="INDEX-574" />
<a name="INDEX-575" />
<p>The second character indicates whether the chunk is public (uppercase)
or private. Public chunks are those defined in the specification or
registered as official, special-purpose types. But a company may wish
to encode its own, application-specific information in a PNG file, and
private chunks are one way to do that.</p></li><li><p>The case bit of the third character is reserved for use by future versions
of the PNG specification. It must be uppercase for PNG 1.0 and 1.1 files,
but a decoder encountering an unknown chunk with a lowercase third character
should deal with it as with any other unknown chunk.</p></li><li><p>The last character's case bit is intended for image editors rather than
simple viewers or other decoders. It indicates whether an editing program
encountering an unknown <em class="emphasis">ancillary</em> chunk<a href="#FOOTNOTE-57">[57]</a>
can safely copy it into the new file (lowercase) or not (uppercase). If an
unknown chunk is marked unsafe to copy, then it depends on the image data in
some way. It must be omitted from the new image if any <em class="emphasis">critical</em> chunks
have been modified in any way, including the addition of new ones or the
reordering or deletion of existing ones. Note that if the program recognizes
the chunk, it may choose to modify it appropriately and then copy it to the new
file. Also note that unsafe-to-copy chunks may be copied to the new file if
only ancillary chunks have been modified--again, including addition, deletion,
and reordering--which implies that ancillary chunks cannot depend on other
ancillary chunks.</p><blockquote class="footnote">
<a name="FOOTNOTE-57" /><p>[57] Since any decoder encountering an unknown critical chunk has no idea
how the chunk modifies the image--only that it does so in a critical
way--an editor cannot safely copy <em class="emphasis">or</em> omit the chunk in the new
image.</p>
</blockquote></li></ul><p><a name="INDEX-576" /></p>
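<p>The four property bits listed above can be read directly from bit 5 of each
character code. The following is a minimal sketch; the function name
<code>chunk_properties</code> and the dictionary keys are illustrative
choices, not names defined by the specification.</p>

```python
def chunk_properties(name: str) -> dict:
    """Decode the four case bits of a 4-letter chunk name."""
    if len(name) != 4 or not (name.isascii() and name.isalpha()):
        raise ValueError("chunk names are exactly four ASCII letters")
    codes = name.encode("ascii")
    # Bit 5 (value 32) is set for lowercase ASCII letters, clear for uppercase.
    lower = [bool(code & 0x20) for code in codes]
    return {
        "ancillary": lower[0],         # uppercase first letter means critical
        "private": lower[1],           # uppercase second letter means public
        "reserved_bit_set": lower[2],  # must be clear (uppercase) in PNG 1.0 and 1.1
        "safe_to_copy": lower[3],      # only meaningful to editors
    }


if __name__ == "__main__":
    for name in ("IHDR", "tEXt", "tRNS"):
        print(name, chunk_properties(name))
```

<p>For IHDR every bit is clear, so it is a critical, public chunk. For tEXt
the first and last bits are set, marking it as ancillary and safe to copy.</p>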
</div>
</div>
<div class="sect1"><a name="png.ch08.div.2" />
<h2 class="sect1">8.2. PNG Signature</h2>
<p><a name="INDEX-577" />
<a name="INDEX-578" />
<a name="INDEX-579" />So chunk names encode additional information that is primarily useful if
the chunk is not recognized. The remainder of this book will be concerned
with known chunks, but before we turn to those, there is one more component
of PNG files that has to do with the unknown: the PNG file signature. As
noted earlier, the first 8 bytes of the file are not, strictly speaking,
a chunk.<a href="#FOOTNOTE-58">[58]</a>
They are a critical component of a PNG file, however, since they
allow it to be identified as such regardless of filename. But the PNG
signature bytes are more than a simple identifier code: they were cleverly
designed to allow the most common types of file-transfer corruption to be
detected. Web protocols these days typically ensure the correct transfer
of binary files such as PNG images, but older transfer programs like the
venerable command-line FTP (File Transfer Protocol)
<a name="INDEX-580" />
<a name="INDEX-581" />
often default to text-mode or ``ASCII'' transfers. The unsuspecting user who
transfers a PNG image or other binary file as text is practically guaranteed
to destroy it. The same is true of the user who extracts a PNG file from
a compressed archive in text mode or who emails it without some form of
``ASCII armor'' (such as MIME Base64 encoding or Unix uuencoding).</p><blockquote class="footnote">
<a name="FOOTNOTE-58" /><p>[58] They can be thought of as such, however, since their length is known
(8 bytes), their position and purpose are known (beginning of the
file; signature), and their CRC is implied (the 8 bytes are
constant, so effectively they are their own CRC).</p>
</blockquote>
<p>The 8-byte PNG file signature can detect this sort of problem because
it simulates a text file in some respects. The 8 bytes are given in
<a href="#png.ch08.tbl.1">Table 8-1</a>.</p>
<a name="png.ch08.tbl.1" />
<div class="table" align="center">
<p>
<table width="502" border="0">
<tr>
<td>
<b class="emphasis-bold">Table 8-1.</b>
<i>PNG Signature Bytes</i>
</td>
</tr>
</table>
</p>
<p>
<table width="502" border="1">
<tr>
<td><b class="emphasis-bold">Decimal<br />Value</b></td>
<td><b class="emphasis-bold">ASCII Interpretation</b></td>
</tr>
<tr>
<td>137</td>
<td>A byte with its most significant bit set (``8-bit character'')</td>
</tr>
<tr>
<td>80</td>
<td>P</td>
</tr>
<tr>
<td>78</td>
<td>N</td>
</tr>
<tr>
<td>71</td>
<td>G</td>
</tr>
<tr>
<td>13</td>
<td>Carriage-return (CR) character, a.k.a. CTRL-M or ^M</td>
</tr>
<tr>
<td>10</td>
<td>Line-feed (LF) character, a.k.a. CTRL-J or ^J</td>
</tr>
<tr>
<td>26</td>
<td>CTRL-Z or ^Z</td>
</tr>
<tr>
<td>10</td>
<td>Line-feed (LF) character, a.k.a. CTRL-J or ^J</td>
</tr>
</table>
</p>
</div>
<p>The first byte is used to detect transmission over a 7-bit channel--for
example, email transfer programs often strip the 8th bit, thus changing
the PNG signature. The 2nd, 3rd, and 4th bytes simply spell ``PNG''
(in ASCII, that is). Bytes 5 and 6 are end-of-line characters for
Macintosh and Unix, respectively, and the combination of the two is the
standard line ending for DOS, Windows, and OS/2. Byte 7 (CTRL-Z)
is the end-of-file character for DOS text files, which allows one to
TYPE the PNG file under DOS-like operating systems and see only
the acronym ``PNG'' preceded by one strange character, rather than page
after page of gobbledygook. Byte 8 is another Unix end-of-line character.</p>
<p>Text-mode transfer of a PNG file from a DOS-like system to Unix will strip off
the carriage return (byte 5); the reverse transfer will replace byte 8
with a CR/LF pair. Transfer to or from a Macintosh will strip off the line
feeds or replace the carriage return with a line feed, respectively.
Either way, the signature is altered, and in all likelihood the remainder of
the file is irreversibly damaged.</p>
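<p>To make the effect concrete, here is a small sketch that checks the
signature and simulates three kinds of damage described above. The
conversion functions are simplified stand-ins for what a text-mode transfer
does, written only for illustration; real transfer programs may differ in
detail.</p>

```python
import re

# The eight signature bytes from Table 8-1.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def is_png(data: bytes) -> bool:
    """True if `data` begins with an undamaged PNG signature."""
    return data[:8] == PNG_SIGNATURE


def dos_to_unix(data: bytes) -> bytes:
    """Simulated text-mode transfer: CR/LF pairs collapse to a bare LF."""
    return data.replace(b"\r\n", b"\n")


def unix_to_dos(data: bytes) -> bytes:
    """Simulated text-mode transfer: each bare LF becomes a CR/LF pair."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


def strip_high_bit(data: bytes) -> bytes:
    """Simulated 7-bit channel: the most significant bit of each byte is lost."""
    return bytes(b & 0x7F for b in data)


if __name__ == "__main__":
    for damage in (dos_to_unix, unix_to_dos, strip_high_bit):
        mangled = damage(PNG_SIGNATURE)
        print(damage.__name__, list(mangled), is_png(mangled))
```

<p>Each of the three simulated transfers changes at least one signature byte,
so a decoder that compares the first 8 bytes against the expected values
rejects the file immediately.</p>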
<p>Note that the 9th, 10th, and 11th bytes are guaranteed to be 0 (that
is, the ASCII NUL character) by the fact that the first chunk is required to
be IHDR, whose first 4 bytes are its length--a value that is currently
13 and, according to the spec, will never change. (Instead, ``new chunk types
will be added to carry new information.'') The fact that the 0 bytes in
the length come first is another benefit of the big-endian integer format,
which stores the high-order bytes first. Since NUL bytes are also often
stripped out by text-mode transfer protocols, the detection of damaged PNG
files is even more robust than the signature alone would suggest.
<a name="INDEX-582" />
<a name="INDEX-583" />
<a name="INDEX-584" />
</p>
</div>
<div class="sect1"><a name="png.ch08.div.3" />
<h2 class="sect1">8.3. A Word on Color Representation</h2>
<p><a name="INDEX-585" />Before we start putting chunks together, however, a brief interlude on the
representation and terminology of color is useful. Color fundamentally refers
to a property of light--namely, its wavelength. Each color in the rainbow,
from red to purple, is
a relatively pure strain of wavelengths of light,
and none of these colors can be generated by adding together any of the
others.<a href="#FOOTNOTE-59">[59]</a>
Furthermore, despite what our eyeballs would have us think, the
spectrum does not end at deep purple; beyond that are the ultraviolet, X-ray,
and gamma-ray domains. Nor does it end at dull red--smoke on the water glows
in the infrared, if only we could see it, and still further down the spectrum
are radio waves.<a href="#FOOTNOTE-60">[60]</a>
Each of these wavelength regions, from radio on up to gamma, is a color.</p><blockquote class="footnote">
<a name="FOOTNOTE-59" /><p>[59] Mathematically, this is known as <em class="emphasis">orthogonality</em> and is the basis
for Fourier decomposition, among other things.</p>
</blockquote><blockquote class="footnote">
<a name="FOOTNOTE-60" /><p>[60] It is probably not coincidence that the range of light visible to our
water-filled orbs just happens to be the precise range of wavelengths
that is <em class="emphasis">not</em> strongly absorbed by water.</p>
</blockquote>
<p><a name="INDEX-586" />
<a name="INDEX-587" />So when someone refers to an RGB image--that is, containing only red,
green, and blue values--as ``truecolor,'' what twisted logic lies behind
such a claim? The answer lies not in physics but in physiology. Human eyes
contain only three classes of color sensors, which trigger color sensations
in the brain in ways that are not yet fully understood. One might guess that
these sensors (the <em class="emphasis">cones</em>) are tuned to red, green, and blue light, but
that turns out not to be the case, at least not directly. Instead, signals
from the three types of cones are added and subtracted in various ways,
apparently in more than one stage. The details are not especially important;
what matters is that the end result is a set of only three signals going
into the brain, corresponding to luminosity (or brightness), a red-versus-green
intensity level, and a yellow-versus-blue level. In addition, the cones are
not narrow-band sensors, but instead each responds to a broad range of
wavelengths. The upshot is that the human visual system is relatively poor
at analyzing colors, so feeding it different combinations of red, green, and
blue light suffices to fool it into thinking it is seeing an entire spectrum.
Keep in mind, however, that while true yellow and a combination of red and
green may look identical to us, to spectrometers (or nonhuman eyes) they are
quite different.
</p>
<p><a name="INDEX-588" />In fact, even printers ``see'' color differently. Since they employ pigments,
which absorb light rather than emit it, the RGB color space that works so well
for computer monitors is inappropriate. Instead, they use a ``dual'' color
space based on cyan, magenta, and yellow, or CMYK for short.<a href="#FOOTNOTE-61">[61]</a>
<a name="INDEX-589" />
<a name="INDEX-590" />
<a name="INDEX-591" />And in video processing, television, and the JPEG image format, yet another
set of color spaces is popular: YUV, YIQ, and YC<sub class="subscript">b</sub>C<sub class="subscript">r</sub>, all
of which represent light as an intensity value (Y) and a pair of orthogonal
color vectors (U and V, or I and Q, or C<sub class="subscript">b</sub> and C<sub class="subscript">r</sub>). All
of these color spaces are beyond the scope of this book, but note that
<em class="emphasis">every single one of them has its basis in human physiology</em>. Indeed,
if YUV and its brethren sound quite a lot like the set of three signals
going into the brain that I just discussed, rest assured that it's not
coincidence. Not a single color space in common use today truly represents
the full continuum of physical color.</p><blockquote class="footnote">
<a name="FOOTNOTE-61" /><p>[61] The <em class="emphasis">K</em> is for black. Since black is the preferred color for a huge class
of printed material, including text, it is more efficient and considerably
cheaper to use a single pigment for it than always to be mixing the other
three. Some printing systems actually use five, six, or even seven distinct
pigments.</p>
</blockquote>
<p>Finally, note that image files may represent the appearance of a scene not
only as a self-contained item, but also in reference to a background or to
other images or text. In particular, transparency information is often
<a name="INDEX-592" />desirable. The simplest approach to transparency in computer graphics is
to mark a particular color as transparent, but more complex applications
will generally require a completely separate channel of information. This
<a name="INDEX-593" />is known as an <em class="emphasis">alpha channel</em> (or sometimes an alpha mask) and enables
the use of partial transparency, such as is often used in television overlays.
In the text that follows, I will
refer to an RGB image with an alpha channel as an <em class="emphasis">RGBA</em> image. PNG
adheres to the usual convention that alpha represents opacity; that is, an
alpha value of 0 is fully transparent, and the maximum value for the pixel
<a name="INDEX-594" />depth is completely opaque. PNG also uses only <em class="emphasis">unassociated</em> alpha,
wherein the actual gray or color values are stored unchanged and are only
affected by the alpha channel at display time. The alternative is
<a name="INDEX-595" />
<a name="INDEX-596" /><em class="emphasis">associated</em> or <em class="emphasis">premultiplied</em> alpha, in which the pixel values
are effectively precomposited against a black background; although this
allows slightly faster software compositing, it amounts to a lossy
transformation of the image data and was therefore rejected in the design
of PNG.
<a name="INDEX-597" /></p>
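<p>The difference between unassociated and premultiplied alpha can be shown
with a little integer arithmetic. The sketch below assumes 8-bit samples and
the usual linear blend, with rounding to the nearest integer; the function
names are hypothetical, and gamma correction is ignored for simplicity.</p>

```python
def composite(fg: int, alpha: int, bg: int, max_alpha: int = 255) -> int:
    """Blend an unassociated foreground sample over a background sample."""
    # alpha == 0 yields the background; alpha == max_alpha yields the foreground.
    return (fg * alpha + bg * (max_alpha - alpha) + max_alpha // 2) // max_alpha


def premultiply(fg: int, alpha: int, max_alpha: int = 255) -> int:
    """Precomposite a sample against black, as associated alpha would store it."""
    return (fg * alpha + max_alpha // 2) // max_alpha


if __name__ == "__main__":
    # Unassociated alpha: the stored sample (200) is untouched until display time.
    print(composite(200, 0, 50), composite(200, 128, 50), composite(200, 255, 50))
    # Premultiplied alpha: distinct samples can collapse to the same stored value.
    print(premultiply(200, 1), premultiply(201, 1), premultiply(200, 0), premultiply(100, 0))
```

<p>In the second line of output, different original samples map to the same
premultiplied value, so the original cannot be recovered. That loss of
information is the sense in which premultiplication is a lossy
transformation.</p>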
</div>
<div class="sect1"><a name="png.ch08.div.4" />
<h2 class="sect1">8.4. The Simplest PNG</h2>
<p><a name="INDEX-598" />We've looked at the fine details of a PNG file--the subatomic structure,
if you will--so let us turn now to a few of the basic atoms (chunks) that
will allow us to create a complete ``molecule,'' or valid Portable Network
Graphics file. The simplest possible PNG file, diagrammed in
<a href="#png.ch08.fig.2">Figure 8-2</a>, is composed of the PNG signature and only three chunk types: the image header
chunk, IHDR; the image data chunk, IDAT; and the end-of-image chunk, IEND.
<a name="INDEX-599" />
<a name="INDEX-600" />IHDR must be the first chunk in a PNG image, and it includes all of the
details about the type of the image: its height and width, pixel depth,
compression and filtering methods, interlacing method, whether it has an
alpha (transparency) channel, and whether it's a truecolor, grayscale, or
colormapped (palette) image. Not all combinations of image types are valid,
however, and much of the remainder of this chapter will be devoted to a
discussion of what is allowed.
</p>
<a name="png.ch08.fig.2" />
<div class="figure" align="center">
<p>
<table width=502>
<tr>
<td>
<img width=502 height=124 border=0
src="figs/png.0802.png" alt="Figure 8-2" /><br>
<br>
<b>Figure 8-2:</b> <i>Layout of the simplest PNG.</i>
</td>
</tr>
</table>
</p>
</div>
<a name="INDEX-601" />
<a name="INDEX-602" />
<p>IDAT contains all of the image's compressed pixel data. Although single
IDATs are perfectly valid as long as they contain no more than 2 gigabytes
of compressed data, in most images the compressed data is split into several
IDAT chunks for greater robustness. Since the chunk's CRC is at the end, a
streaming application that encounters a large IDAT can either force the user
to wait until the complete chunk arrives before displaying anything, or it
can begin displaying the image without knowing if it's valid. In the latter
case, if the IDAT happens to be damaged, the user will see garbage on the
display. (Since the image dimensions were already read from a previously
CRC-checked chunk, in theory the garbage will be restricted to the region
belonging to the image.) Fortunately, small IDAT chunks are by far the most
common, particularly in sizes of 8 or 32 kilobytes.</p>
<p><a name="INDEX-603" />
<a name="INDEX-604" />IEND is the simplest chunk of all; it contains no data and merely indicates
that there are no more chunks in the image. IEND is primarily useful
when the PNG image is being transferred over the network as a stream,
especially when it is part of a larger MNG stream (<a href="chapter12.html">Chapter 12, "Multiple-Image Network Graphics"</a>). And
it serves as one more check that the PNG file is complete and
internally self-consistent.</p>
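<p>As an illustration, the following sketch assembles a complete 1 × 1
grayscale PNG from the signature and the three chunks just described. Two
details are taken from the PNG specification rather than from this chapter
and should be read as assumptions here: the order of the 13 bytes inside
IHDR, and the filter-type byte that precedes each row of pixel data. The
helper names are hypothetical.</p>

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Length, type, data, and a CRC computed over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def simplest_png(gray: int = 128) -> bytes:
    """Build a 1x1, 8-bit grayscale PNG: signature, IHDR, IDAT, IEND."""
    # Assumed IHDR layout (per the PNG spec): width, height, bit depth,
    # color type (0 = grayscale), compression, filter, interlace methods.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One row: filter-type byte 0 ("none") followed by the single pixel.
    row = bytes([0, gray])
    idat = zlib.compress(row)
    return PNG_SIGNATURE + chunk(b"IHDR", ihdr) + chunk(b"IDAT", idat) + chunk(b"IEND")


if __name__ == "__main__":
    png = simplest_png()
    print(len(png), "bytes")
    with open("simplest.png", "wb") as f:  # binary mode, for the reasons given in 8.2
        f.write(png)
```

<p>The resulting file should open in any PNG viewer as a single gray pixel.
Note that bytes 9 through 11 are zero, as promised earlier, because the IHDR
length of 13 is stored as a big-endian 32-bit integer.</p>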
<p>These three chunk types are sufficient to build truecolor and grayscale PNG
files, with or without an alpha channel, but palette-based images require one
more: PLTE, the palette chunk. PLTE simply contains a sequence of red, green,
and blue values, where a value of 0 is black and 255 is full intensity;
anywhere from 1 to 256 RGB triplets are allowed, depending on the pixel
depth of the image. (That is, for a 4-bit image, no more than 16 palette
entries are allowed.) The PLTE chunk must come before the first IDAT chunk;
the structure of a colormapped PNG is shown in
<a href="#png.ch08.fig.3">Figure 8-3</a>.
</p>
<a name="png.ch08.fig.3" />
<div class="figure" align="center">
<p>
<table width=502>
<tr>
<td>
<img width=502 height=145 border=0
src="figs/png.0803.png" alt="Figure 8-3" /><br>
<br>
<b>Figure 8-3:</b> <i>Layout of the second-simplest PNG.</i>
</td>
</tr>
</table>
</p>
</div>
<p>
<a name="INDEX-605" />
</p>
</div>
<div class="sect1"><a name="png.ch08.div.5" />
<h2 class="sect1">8.5. PNG Image Types</h2>
<p><a name="INDEX-606" />I noted earlier that not all possible combinations of PNG image types
and features are allowed by the specification. Let's take a closer
look at the basic types and their features.</p>
<a name="png.ch08.div.5.1" /><div class="sect2">
<h3 class="sect2">8.5.1. Palette-Based</h3>
<p><a name="INDEX-607" />
<a name="INDEX-608" />
<a name="INDEX-609" />
<a name="INDEX-610" />Palette-based images, also known as colormapped or index-color images, use
the PLTE chunk and are supported in four pixel depths: 1, 2, 4, and 8 bits,
corresponding to a maximum of 2, 4, 16, or 256 palette entries. Unlike GIF
images, however, fewer than the maximum number of entries may be present.
On the other hand, GIF does support pixel depths of 3, 5, 6, and 7 bits;
6-bit (64-color) images, in particular, are common on the World Wide Web.
</p>
<p><a name="INDEX-611" />TIFF also supports palette images, but baseline TIFF allows only 4-
and 8-bit pixel depths. Perhaps a more useful comparison is with the
superset of baseline TIFF that is supported by Sam Leffler's free
<a name="INDEX-612" />
<a name="INDEX-613" />libtiff, which has become the software industry's unofficial standard
for TIFF decoding. libtiff supports palette bit depths of 1, 2, 4, 8,
and 16 bits. Unlike PNG and GIF, however, the TIFF palette always
uses 16-bit integers for each red, green, and blue value, and as with
GIF, all 2<sup class="superscript"><em class="emphasis">bit depth</em></sup>
entries must be present in the file. Nor is there any provision for
compression of the palette data--so a 16-bit TIFF palette would require
384 KB all by itself.</p>
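<p>The palette sizes quoted above follow from simple arithmetic, sketched
here. The function names are illustrative only. The PNG figure assumes one
byte per red, green, and blue value in PLTE, and the TIFF figure assumes a
full table of 16-bit values as described in the text.</p>

```python
def max_palette_entries(bit_depth: int) -> int:
    """Largest number of palette entries addressable at a given pixel depth."""
    return 2 ** bit_depth


def png_plte_bytes(entries: int) -> int:
    """PLTE data size: one byte each for red, green, and blue per entry."""
    return 3 * entries


def tiff_colormap_bytes(bit_depth: int) -> int:
    """TIFF palette size: all 2**bit_depth entries, three 16-bit values each."""
    return 3 * 2 * max_palette_entries(bit_depth)


if __name__ == "__main__":
    for depth in (1, 2, 4, 8):
        print(depth, max_palette_entries(depth), png_plte_bytes(max_palette_entries(depth)))
    print(tiff_colormap_bytes(16) // 1024, "KB")  # 384 KB
```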
</div>
<a name="png.ch08.div.5.2" /><div class="sect2">
<h3 class="sect2">8.5.2. Palette-Based with Transparency</h3>
<p><a name="INDEX-614" />
<a name="INDEX-615" />
<a name="INDEX-616" />
<a name="INDEX-617" />
<a name="INDEX-618" />The PNG spec forbids the use of a full alpha channel
with palette-based images,
but it does allow ``cheap alpha'' via the transparency chunk, tRNS. As its
name implies--the first letter is lowercase--tRNS is an ancillary chunk,
which means the image is still viewable even if the decoder somehow fails to
recognize the chunk.<a href="#FOOTNOTE-62">[62]</a>
The structure of tRNS depends on the image type, but
for palette-based images it is exactly analogous to the PLTE chunk. It may
contain as many transparency entries as there are palette entries (more than
that would not make any sense) or as few as one, and it must come after PLTE
and before the first IDAT. In effect, it transforms the palette from an RGB
lookup table to an RGBA table, which implies a potential factor-of-four
savings in file size over a full 32-bit RGBA image. The icicle image used
as a basis for <A HREF="fig_C1.html">Figure C-1</A> in the color insert is
an RGBA-palette image; it is ``only'' 3.85 times smaller than the 32-bit
original due to dithering (which hurts compression).
</p><blockquote class="footnote">
<a name="FOOTNOTE-62" /><p>[62] Once again, the distinction between critical and ancillary chunks is largely
irrelevant for chunks defined in the specification, since presumably they
<a name="INDEX-619" />
<a name="INDEX-620" />are known by all decoders. But even the names of standard chunks were chosen
in accordance with the rules, as if they might be encountered by a
particularly simple-minded PNG decoder. In fact, this was done in order to
test the chunk-naming rules themselves: would a decoder that relied only on
them behave sensibly? The answer was ``yes.''</p>
</blockquote>
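<p>The naming rule mentioned above is easy to test mechanically: in the PNG specification, a lowercase first letter corresponds to bit 5 of the first byte of the chunk name being set. The following is a minimal sketch of such a check; the function name is an assumption made for illustration.</p>

```python
def is_ancillary(chunk_type: bytes) -> bool:
    """Return True if a 4-byte PNG chunk name marks an ancillary chunk.

    A lowercase first letter (bit 5 of the first byte set) means ancillary;
    an uppercase first letter means critical.
    """
    if len(chunk_type) != 4:
        raise ValueError("PNG chunk names are exactly four bytes long")
    return bool(chunk_type[0] & 0x20)


for name in (b"IHDR", b"PLTE", b"tRNS", b"IDAT"):
    print(name.decode("ascii"), "ancillary" if is_ancillary(name) else "critical")
```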
<p><a name="INDEX-621" />By comparison, GIF supports only binary transparency, wherein a single
palette color is marked as completely transparent, while all others
are fully opaque. GIF has a tiny advantage in that the transparent
entry can live anywhere in the palette, whereas a single PNG
transparency entry should come first--all tRNS entries before the
transparent one must exist and must have the value 255 (fully opaque),
which would be redundant and therefore a waste of space. But the code
necessary to rearrange the palette so that all non-opaque entries come
before any opaque ones is simple to write, and the benefits of PNG's
more flexible transparency scheme far outweigh this minor drawback.
</p>
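<p>The palette rearrangement described above might be sketched as follows. This is only one possible approach, assuming the caller supplies one alpha value per palette entry; the function and parameter names are hypothetical.</p>

```python
def reorder_palette(palette, alphas, pixels):
    """Move non-opaque palette entries to the front and trim the tRNS data.

    palette: list of (r, g, b) tuples
    alphas:  list of alpha values (0-255), one per palette entry
    pixels:  iterable of palette indices making up the image
    Returns (new_palette, trns_data, new_pixels).
    """
    # Stable sort: entries with alpha < 255 come first, otherwise the
    # original relative order is preserved.
    order = sorted(range(len(palette)), key=lambda i: alphas[i] == 255)
    remap = {old: new for new, old in enumerate(order)}

    new_palette = [palette[i] for i in order]
    # tRNS needs entries only up to the last non-opaque one; the remaining
    # palette entries are implicitly fully opaque.
    n_nonopaque = sum(1 for a in alphas if a != 255)
    trns_data = bytes(alphas[i] for i in order[:n_nonopaque])
    new_pixels = [remap[p] for p in pixels]
    return new_palette, trns_data, new_pixels
```

<p>With a four-color palette in which only the third and fourth entries are non-opaque, the tRNS data shrinks from four bytes to two, and the pixel indices are renumbered to match the new palette order.</p>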
<p><a name="INDEX-622" />The TIFF format supports at least three kinds of transparency information,
two involving an interleaved alpha channel (<em class="emphasis">extra samples</em>) and the third
involving a completely separate subimage (or <em class="emphasis">subfile</em>) that is used as a
bilevel transparency mask. Baseline TIFF does not require support for any
of them, but libtiff supports the two interleaved flavors directly,
and could probably be manhandled into some level of support for the subfile
approach, although the transparency mask is ``typically at a higher
resolution than the main image if the main image is grayscale or color,''
according to the TIFF 6.0 specification.
On the other hand, with the possible exception of user-designed TIFF tags,
there is no support at all for ``cheap alpha,'' i.e., marking one or more
palette entries as partially or completely transparent.
<a name="INDEX-623" />
<a name="INDEX-624" />
</p>
</div>
<a name="png.ch08.div.5.3" /><div class="sect2">
<h3 class="sect2">8.5.3. Grayscale</h3>
<p><a name="INDEX-625" />PNG grayscale images support the widest range of pixel depths of any image
type. Depths of 1, 2, 4, 8, and 16 bits are supported, covering
everything from simple black-and-white scans to full-depth medical and
raw astronomical images.<a href="#FOOTNOTE-63">[63]</a>
</p><blockquote class="footnote">
<a name="FOOTNOTE-63" /><p>[63] Calibrated astronomical image data is usually stored as 32-bit or
64-bit floating-point values, and some raw data is represented as
32-bit integers. Neither format is directly supported by PNG, although
one could, in principle, design an ancillary chunk to hold the proper
conversion information. Conversion of data with more than 16 bits of
dynamic range would be a lossy transformation, however--at least,
barring the abuse of PNG's alpha channel or RGB capabilities.</p>
</blockquote>
<p><a name="INDEX-626" />
<a name="INDEX-627" />There is no direct comparison with GIF images, although it is certainly
possible to store grayscale data in a palette image for both GIF and PNG.
The only place a gray palette is commonly distinguished from a regular color
one, however, is in VRML97 texture maps. Baseline TIFF images, on the other
hand, support 1-bit ``bilevel'' and 4- and 8-bit grayscale depths.
Nonbaseline TIFF allows arbitrary bit depths, but libtiff accepts
only 1-, 2-, 4-, 8-, and 16-bit images. TIFF also supports an inverted
grayscale, wherein 0 represents white and the maximum pixel value represents
black.
</p>
<p><a name="INDEX-628" />The most common form of JPEG (the one that uses ``lossy'' compression, in
which some information in the image is thrown away) likewise supports
grayscale images in depths of 8 and 12 bits. In addition, there are two
variants that use truly lossless compression and support any depth from 2 to
16 bits: the traditional version, known simply as ``lossless JPEG,'' and an
upcoming second-generation flavor called ``JPEG-LS.''<a href="#FOOTNOTE-64">[64]</a>
But the first is extremely
rare, and is supported by almost no one, despite having been standardized years
ago, and the second is also currently unsupported (although that is to be expected
for a new format). Lossy JPEG is very well supported, thanks largely to
<a name="INDEX-629" />the Independent JPEG Group's free <em class="emphasis">libjpeg</em> (which, like libtiff,
has become the de facto standard for JPEG encoding and decoding)--but,
of course, it's lossy. Note that libjpeg can be compiled to support
either 8-bit or 12-bit JPEG, but not both at the same time. Thus, from a
practical standpoint, only 8-bit, lossy grayscale is supported.</p><blockquote class="footnote">
<a name="FOOTNOTE-64" /><p>[64] Be aware that even at the highest quality settings, the common form of JPEG
is never lossless, regardless of whether the setting claims 100% or
something similar.</p>
</blockquote>
</div>
<a name="png.ch08.div.5.4" /><div class="sect2">
<h3 class="sect2">8.5.4. Grayscale with Transparency</h3>
<p><a name="INDEX-630" />
<a name="INDEX-631" />PNG supports two kinds of transparency with grayscale and RGB images.
The first is a palette-style ``cheap transparency,'' in which a single
color or gray value is marked as being fully transparent. I noted
earlier that the structure of tRNS depends on the image type; for
grayscale images of any pixel depth, the chunk contains a 2-byte,
unscaled gray value--that is, the maximum allowed value is still
2<sup class="superscript"><em class="emphasis">bit depth</em></sup>-1, even though it is stored as a 16-bit
integer. This approach is very similar to GIF-style transparency in
palette images and incurs only 14 bytes overhead in file size. There
is no corresponding TIFF image type, and standard JPEG does not
support any transparency.</p>
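<p>The 14-byte figure can be checked by building the chunk by hand. A PNG chunk consists of a 4-byte length, the 4-byte type, the data, and a 4-byte CRC computed over the type and data. The helper names below are assumptions for this sketch, not part of libpng.</p>

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize a PNG chunk: length, type, data, CRC (over type + data)."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def gray_trns(gray_value: int, bit_depth: int) -> bytes:
    """Build a tRNS chunk marking one unscaled gray level as transparent."""
    if not 0 <= gray_value <= (1 << bit_depth) - 1:
        raise ValueError("gray value must not exceed 2**bit_depth - 1")
    # The value is stored as a 16-bit big-endian integer at every bit depth.
    return make_chunk(b"tRNS", struct.pack(">H", gray_value))


chunk = gray_trns(3, bit_depth=2)
print(len(chunk))   # 4 (length) + 4 (type) + 2 (data) + 4 (CRC)
```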
</div>
<a name="png.ch08.div.5.5" /><div class="sect2">
<h3 class="sect2">8.5.5. Grayscale with Alpha Channel</h3>
<p><a name="INDEX-632" />The second kind of transparency supported by grayscale images is an alpha
channel. This is a more expensive approach in terms of file size--for
grayscale, it doubles the number of image bytes--but it allows the user
much greater freedom in setting individual pixels to particular levels
of partial transparency. Only 8-bit and 16-bit grayscale images may have
an alpha channel, which must match the bit depth of the gray channel.</p>
<p><a name="INDEX-633" />The full TIFF specification supports two kinds of interleaved ``extra samples''
for transparency: associated and unassociated alpha (though not at the
same time). Unlike PNG, TIFF's alpha channel may be of a different bit depth
from the main image data--in fact, every channel in a TIFF image may have an
arbitrary depth. TIFF also offers the explicit possibility of treating
a ``subfile,'' or secondary image within the file, as a transparency mask,
though such masks are only 1 bit deep, and therefore support only completely
opaque or completely transparent pixels.</p>
<p>Baseline TIFF does not require support for any of this, however. Current
versions of <em class="emphasis">libtiff</em> can read an interleaved alpha channel as generic
``extra samples,'' but it is up to the application to interpret the
samples correctly. The library does not support images with channels of
different depths, and although it could be manipulated into reading a
secondary grayscale subfile (which the application could interpret as a
full alpha channel), that would be a user-defined extension--i.e., specific
to the application and not supported by any other software.</p>
<p><a name="INDEX-634" />As I just noted, standard JPEG (by which I mean the common JPEG File
Interchange Format, or JFIF files) has no provision for transparency. The
JPEG standard itself does allow extra channels, one of which could be treated
as an alpha channel, but this would be fairly pointless. Not only would it
require one to use a non-standard, unsupported file format for storage, there
would also tend to be visual artifacts, since lossy JPEG is not well suited to
the types of alpha masks one typically finds (unless the mask's quality
setting were boosted considerably, at a cost in file size). But see
<a href="chapter12.html">Chapter 12, "Multiple-Image Network Graphics"</a> for details on a MNG subformat called JNG that combines a
lossy JPEG image in JFIF format with a PNG-style, lossless alpha channel.
<a name="INDEX-635" /></p>
</div>
<a name="png.ch08.div.5.6" /><div class="sect2">
<h3 class="sect2">8.5.6. RGB</h3>
<p><a name="INDEX-636" />
<a name="INDEX-637" />RGB (truecolor) PNGs, like grayscale with alpha, are supported in only two
depths: 8 and 16 bits per sample, corresponding to 24 and 48 bits per pixel.
This is the image type most commonly used by image-editing applications like
Adobe Photoshop. Note that pixels are stored in RGB order. (BGR is the other
popular format, especially on Windows-based systems.)</p>
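<p>The depth rules for the various PNG image types can be summarized in a small lookup table. The numeric color-type codes (0, 2, 3, 4, 6) come from the PNG specification's IHDR definition rather than from the discussion above, and the function name is illustrative only.</p>

```python
# Permitted bit depths per PNG color type (per sample, or per palette index).
ALLOWED_DEPTHS = {
    0: (1, 2, 4, 8, 16),   # grayscale
    2: (8, 16),            # RGB
    3: (1, 2, 4, 8),       # palette
    4: (8, 16),            # grayscale with alpha
    6: (8, 16),            # RGB with alpha
}
SAMPLES_PER_PIXEL = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def bits_per_pixel(color_type: int, bit_depth: int) -> int:
    """Return bits per pixel, rejecting combinations PNG does not allow."""
    if bit_depth not in ALLOWED_DEPTHS.get(color_type, ()):
        raise ValueError(f"bit depth {bit_depth} not allowed for color type {color_type}")
    return SAMPLES_PER_PIXEL[color_type] * bit_depth


print(bits_per_pixel(2, 8), bits_per_pixel(2, 16))   # RGB at both depths
```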
<p><a name="INDEX-638" />
<a name="INDEX-639" />
<a name="INDEX-640" />Truecolor PNG images may also include a palette (PLTE) chunk, though the
specialized suggested-palette (sPLT) chunk described in <a href="chapter11.html">Chapter 11, "PNG Options and Extensions"</a> is often
more appropriate. But if present, the palette encodes a suggested set of
colors to which the image may be quantized if the decoder cannot display in
truecolor; the suggestion is presumed to be a <em class="emphasis">good</em> one, so decoders are
encouraged to use it if they can. Of course, multi-image viewers such as web
browsers often resort to a fixed palette for simplicity and rendering speed.
</p>
<p>Baseline TIFF requires support only for 24-bit RGB, but libtiff
supports 1, 2, 4, 8, and 16 bits per sample. Ordinary JPEG stores
only 24-bit RGB,<a href="#FOOTNOTE-65">[65]</a>
though 36-bit RGB is possible with the seldom-supported 12-bit extension.
The also seldom-supported lossless flavor of JPEG can, in theory, store any
sample depth from 2 to 16 bits, thus 6 to 48 bits per RGB pixel.
</p><blockquote class="footnote">
<a name="FOOTNOTE-65" /><p>[65] <a name="INDEX-641" />Technically, color JPEGs are almost always encoded internally in the
YC<sub class="subscript">b</sub>C<sub class="subscript">r</sub> color space and converted to or from RGB by
the decoder or encoder software.</p>
</blockquote>
</div>
<a name="png.ch08.div.5.7" /><div class="sect2">
<h3 class="sect2">8.5.7. RGB with Transparency</h3>
<p><a name="INDEX-642" />
<a name="INDEX-643" />As mentioned previously, PNG supports cheap transparency in RGB images via the
tRNS chunk. The format is similar to that for grayscale images, except now
the chunk contains <em class="emphasis">three</em> unscaled, 16-bit values (red, green, and blue),
and the corresponding RGB pixel is treated as fully transparent. This option
adds only 18 bytes to the image, and there are no corresponding TIFF or JPEG
image types.</p>
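<p>On the decoding side, a reader might expand such an image to RGBA roughly as sketched below: the six data bytes of tRNS are unpacked as three big-endian 16-bit values, and every pixel matching that color is given an alpha of zero. The function names are hypothetical, and pixels are assumed to be already unpacked into tuples.</p>

```python
import struct


def parse_rgb_trns(data: bytes):
    """Unpack the tRNS data of an RGB image into an (r, g, b) tuple."""
    if len(data) != 6:
        raise ValueError("RGB tRNS data must be exactly six bytes")
    return struct.unpack(">3H", data)


def apply_rgb_trns(pixels, trns_rgb, bit_depth=8):
    """Expand RGB pixels to RGBA; the matching color becomes fully transparent."""
    opaque = (1 << bit_depth) - 1
    return [
        (r, g, b, 0 if (r, g, b) == trns_rgb else opaque)
        for (r, g, b) in pixels
    ]


key = parse_rgb_trns(b"\x00\xff\x00\x00\x00\xff")      # magenta in an 8-bit image
print(apply_rgb_trns([(255, 0, 255), (1, 2, 3)], key))
```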
</div>
<a name="png.ch08.div.5.8" /><div class="sect2">
<h3 class="sect2">8.5.8. RGB with Alpha Channel</h3>
<p><a name="INDEX-644" />
<a name="INDEX-645" />Finally, we have truecolor images with an alpha channel, also known as the
RGBA image type. As with RGB and gray+alpha, PNG supports 8 and 16 bits per
sample for RGBA, or 32 and 64 bits per pixel, respectively. Pixels are always
stored in RGBA order, and the alpha channel is not premultiplied.</p>
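<p>Because the alpha channel is not premultiplied, a viewer has to scale the foreground by alpha itself when compositing. A minimal sketch for one pixel over an opaque background follows; it works directly on the stored sample values and deliberately ignores gamma correction, and the function name is an assumption for illustration.</p>

```python
def composite(fg_rgba, bg_rgb, bit_depth=8):
    """Composite one non-premultiplied RGBA pixel over an opaque background."""
    max_val = (1 << bit_depth) - 1
    r, g, b, a = fg_rgba
    # out = (a * fg + (max - a) * bg) / max, rounded to the nearest integer
    return tuple(
        (a * f + (max_val - a) * k + max_val // 2) // max_val
        for f, k in zip((r, g, b), bg_rgb)
    )


print(composite((255, 0, 0, 128), (0, 0, 255)))   # half-transparent red over blue
```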
<p><a name="INDEX-646" />
<a name="INDEX-647" />
<a name="INDEX-648" />The use of PLTE for a suggested quantization palette is allowed here as well,
but note that since the tRNS chunk is prohibited in RGBA images, the suggested
palette can only encode a recommended quantization for the RGB data or for the
RGBA data composited against the image's background color (see the discussion
of bKGD in <a href="chapter11.html">Chapter 11, "PNG Options and Extensions"</a>), not for the raw RGBA data. Disallowing tRNS is
arguably an unnecessary restriction
in the PNG specification; while a suggested RGBA palette would not necessarily
be useful when compositing the image against a varied background (the different
background pixel values would likely mix with the foreground pixels to form
more than 256 colors), it would be helpful for cases where the background is
a solid color. In fact, this restriction was recognized and addressed by an
extension to the specification approved late in 1996: the suggested-palette
chunk, sPLT, which is discussed in <a href="chapter11.html">Chapter 11, "PNG Options and Extensions"</a>.</p>
<p><a name="INDEX-649" />Although baseline TIFF does not require support for an alpha channel,
libtiff supports RGBA images with 1, 2, 4, 8, or 16 bits per sample;
both associated and unassociated alpha channels are supported. JPEG has
no direct support for alpha transparency, but MNG offers a way around that
(see <a href="chapter12.html">Chapter 12, "Multiple-Image Network Graphics"</a>).
<a name="INDEX-650" />
<a name="INDEX-651" />
<a name="INDEX-652" />
<a name="INDEX-653" />
</p>
</div>
</div>
<div class="sect1"><a name="png.ch08.div.6" />
<h2 class="sect1">8.6. Interlacing and Progressive Display</h2>
<p><a name="INDEX-654" />
<a name="INDEX-655" />We'll wrap up our look at the basic elements of Portable Network Graphics
images with a quick consideration of progressive rendering and interlacing.
Most computer users these days are familiar with the World Wide Web and the
method by which modern browsers present pages. As a rule, the textual
part of a web page is displayed first, since it is transmitted as part of
the page; then images are displayed, with each one rendered as it comes
across the network. Ordinary images are simply painted from the top down,
a few lines at a time; this is the most basic form of progressive display.
</p>
<p>Some images, however, are in a format that allows them to be rendered as an
overall, low-resolution image first, followed by one or more passes that
refine it until the complete, full-resolution image is displayed. For
GIF and PNG images this is known as <em class="emphasis">interlacing</em>. GIF's approach has
four passes and is based on complete rows of the image, making it a
one-dimensional method. First every eighth row is displayed; then every
eighth row is displayed again, only this time offset by four rows from the
initial pass. The third pass consists of every fourth row, and the final
pass includes every other row (half of the image).</p>
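<p>GIF's row ordering can be written down in a few lines. The starting rows for the third and fourth passes (rows 2 and 1, counting from zero) follow the GIF89a specification; the function name is illustrative.</p>

```python
def gif_interlace_rows(height: int):
    """Return 0-based row indices in the order an interlaced GIF stores them."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]   # (first row, row step)
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


print(gif_interlace_rows(16))
```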
<p><a name="INDEX-656" />
<a name="INDEX-657" />
PNG's interlacing method, on the other hand, is a two-dimensional scheme
with seven passes, known as the Adam7 method (after its inventor, Adam
Costello). If one imagines the image being broken up into
8 × 8-pixel
tiles, then the first pass consists of the upper left pixel in each tile--that
is, every eighth pixel, both vertically and horizontally. The second
pass also consists of every eighth pixel, but offset four pixels to the right.
</p>
<a name="png.ch08.fig.4" />
<div class="figure" align="center">
<p>
<table width=502>
<tr>
<td>
<img width=502 height=225 border=0
src="figs/png.0804.png" alt="Figure 8-4" /><br>
<br>
<b>Figure 8-4:</b> <i>Schematic of an 8 × 8 tile
(a) after the third pass and (b) after the fifth pass.</i>
</td>
</tr>
</table>
</p>
</div>
<p>The third pass consists of two pixels per tile, offset by four rows from
the first two pixels (see
<a href="#png.ch08.fig.4">Figure 8-4</a>a). The fourth pass contains four pixels in each tile, offset two columns
to the right of each of the first four pixels, and the fifth pass
contains eight pixels, offset two rows downward (see
<a href="#png.ch08.fig.4">Figure 8-4</a>b).
The sixth pass fills in the remaining pixels on the odd rows (if the
image is numbered starting with row one), and the seventh pass
contains all of the pixels for the even rows. Note that, although I've
described the method in terms of 8 × 8 tiles, pixels for any given
pass are stored as complete rows, not as tiled groups. For example,
the fifth pass consists of every other pixel in the entire third row
of the image, followed by every other pixel in the seventh row, and so
on.</p>
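<p>The seven passes can be expressed as a table of starting offsets and strides. The sketch below lists, for a given pass, the pixel coordinates in the order they are stored (complete rows, left to right); coordinates are zero-based, so the text's third row is row 2 here. The names are assumptions for illustration and do not mirror libpng's internals.</p>

```python
# (x_start, y_start, x_step, y_step) for passes 1 through 7, 0-based.
ADAM7 = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def adam7_pass_pixels(width: int, height: int, pass_number: int):
    """Return the (x, y) coordinates belonging to one Adam7 pass, row by row."""
    x0, y0, dx, dy = ADAM7[pass_number - 1]
    return [(x, y) for y in range(y0, height, dy) for x in range(x0, width, dx)]


# Number of pixels each pass contributes to a single 8 x 8 tile.
print([len(adam7_pass_pixels(8, 8, p)) for p in range(1, 8)])
```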
<p><a name="INDEX-658" />The primary benefit of PNG's two-dimensional interlacing over GIF's
one-dimensional scheme is that one can view a crude approximation of the
entire image roughly eight times as fast.<a href="#FOOTNOTE-66">[66]</a>
That is, PNG's first pass consists of one sixty-fourth of the image
pixels, whereas GIF's first pass consists of one-eighth of the
data. Suppose one were to save a palette image as both an interlaced
GIF and an interlaced PNG. Assuming the compression ratio and download
speeds were identical for the two files, the PNG image would have
completed its fourth pass as the GIF image completed its first. But
most browsers that support progressive display do so by replicating
pixels to fill in the areas that haven't arrived yet. For the PNG
image, that means each pixel at this stage represents a 2 × 4
block, whereas each GIF pixel represents a 1 × 8 strip. In other
words, GIF pixels have an 8-to-1 aspect ratio, whereas PNG pixels are
2-to-1. At the end of the next pass for each format (GIF's second
pass, PNG's fifth; one-quarter of the image in both cases), the PNG
pixels are square 2 × 2 blocks, while the GIF pixels are still
stretched, now as 1 × 4 strips. In practical terms, features in the
PNG image--particularly embedded text--are much more recognizable
than in the GIF image. In fact, readability testing suggests that text
of any given size is legible roughly twice as fast with PNG's
interlacing method.
</p><blockquote class="footnote">
<a name="FOOTNOTE-66" /><p>[66] As I (foot)noted in <a href="chapter01.html">Chapter 1, "An Introduction to PNG"</a>, this implicitly assumes that
one-eighth of the compressed data corresponds to one-eighth of the
uncompressed (image) data, which is not quite accurate. The
difference is likely to be small in most cases, however. I'll
discuss this further in <a href="chapter09.html">Chapter 9, "Compression and Filtering"</a>.</p>
</blockquote>
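<p>The comparison above can be tabulated. For each pass, the sketch below records the share of the image that pass delivers and the width and height of the block that each received pixel stands in for once the pass is complete, assuming a viewer that fills gaps by pixel replication and an image whose dimensions are multiples of eight. All names are illustrative.</p>

```python
from fractions import Fraction

# (share of pixels in this pass, (block width, block height) after this pass)
PNG_PASSES = [
    (Fraction(1, 64), (8, 8)),
    (Fraction(1, 64), (4, 8)),
    (Fraction(1, 32), (4, 4)),
    (Fraction(1, 16), (2, 4)),
    (Fraction(1, 8), (2, 2)),
    (Fraction(1, 4), (1, 2)),
    (Fraction(1, 2), (1, 1)),
]
GIF_PASSES = [
    (Fraction(1, 8), (1, 8)),
    (Fraction(1, 8), (1, 4)),
    (Fraction(1, 4), (1, 2)),
    (Fraction(1, 2), (1, 1)),
]


def cumulative(passes):
    """Return (fraction of image received so far, block size) after each pass."""
    total = Fraction(0)
    result = []
    for share, block in passes:
        total += share
        result.append((total, block))
    return result


for label, passes in (("PNG", PNG_PASSES), ("GIF", GIF_PASSES)):
    for number, (done, block) in enumerate(cumulative(passes), start=1):
        print(label, number, done, block)
```

<p>The output shows PNG's fourth pass and GIF's first both ending at one-eighth of the image, with 2-by-4 blocks against 1-by-8 strips, and PNG's fifth and GIF's second both ending at one-quarter.</p>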
<p><a name="INDEX-659" />JPEG also supports a form of progressive display, but it is not
interlacing in the usual sense of reordering the pixels spatially.
Rather, it involves reordering the frequency components that make up a
JPEG image, first displaying the low-frequency ones and working up to
<a name="INDEX-660" />
<a name="INDEX-661" />the highest frequency band; this is known as <em class="emphasis">spectral selection</em>.
In addition, progressive JPEG can transmit the most significant bits
of each frequency component earlier than the less significant ones, a
<a name="INDEX-662" />
<a name="INDEX-663" />feature known as <em class="emphasis">successive approximation</em> that is very nearly
the same as turning up the JPEG quality setting with each scan. The
two approaches can be used separately, but in practice they are almost
always used in combination. Because JPEG operates on 8 × 8 blocks
of pixels, progressive JPEG bears a strong resemblance to interlaced
PNG during the early stages of display, though it tends to have a
softer, fuzzier look due to the initial lack of high-frequency
components (which is often deliberately enhanced by smoothing in the
decoder). This is visible in
<a href="fig_C4.html#png.color.fig.4a">Figures C-4a</a> and
<a href="fig_C4.html#png.color.fig.4b">C-4b</a> in the color
insert, which represent the second pass of a progressive JPEG image
(26% of the compressed data), both unsmoothed and smoothed. Note in
particular the blockiness in the shadowed interior of the box and the
``colored outside the lines'' appearance around the child's arms and
hands; the first effect is completely eliminated in the smoothed
version, and the second is greatly reduced. JPEG's first pass is
actually more accurate than PNG's, however, since the low-frequency
band for each 8 × 8 pixel block represents an average for all 64
pixels, whereas each 8 × 8 block in PNG's first pass is represented
by a single pixel, usually in the upper left corner of the displayed
block. By its fifth pass, which represents only 40% of the compressed
data, the progressive JPEG version of this image
(<a href="fig_C4.html#png.color.fig.4c">Figure C-4c</a>) is
noticeably sharper and more accurate than all but the final pass of
the PNG version. Keep in mind also that, since the PNG is lossless and
therefore 11 times as large as the JPEG, 40% of the compressed JPEG
data is equivalent to only 3.5% of the PNG data, which corresponds to
the beginning of PNG's third pass. This only emphasizes the point made
previously: for non-transparent, photographic images on the Web, use
JPEG.
</p>
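<p>The size comparison can be reproduced with a little arithmetic. The sketch below takes the rounded figures quoted above (a factor of 11 and 40% of the JPEG data) and, as a simplifying assumption, treats a fraction of the compressed PNG data as the same fraction of its pixels. With the rounded factor the result is roughly 3.6%, in line with the 3.5% quoted. The function name is hypothetical.</p>

```python
size_ratio = 11          # PNG is about 11 times the size of the JPEG
jpeg_fraction = 0.40     # share of the JPEG data used through its fifth pass
png_equivalent = jpeg_fraction / size_ratio

# Cumulative share of pixels at the end of each of PNG's seven passes.
adam7_cumulative = [1 / 64, 1 / 32, 1 / 16, 1 / 8, 1 / 4, 1 / 2, 1]


def pass_in_progress(fraction: float) -> int:
    """Return the PNG pass (1-7) being received at the given data fraction."""
    for number, end in enumerate(adam7_cumulative, start=1):
        if fraction < end:
            return number
    return 7


print(round(png_equivalent * 100, 1), pass_in_progress(png_equivalent))
```

<p>Since the second pass ends at about 3.1% of the pixels and the third at about 6.3%, the equivalent amount of PNG data falls early in the third pass.</p>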
<p>Note that smoothing could be applied to the early passes of interlaced PNGs
and GIFs, as well; tests suggest that this looks better for photographic
images but maybe not as good for simple graphics. (On the other hand,
recall that smoothing did seem to enhance the readability of early interlace
passes in <a href="chapter01.html#png.ch01.fig.4">Figure 1-4</a>.)
As for representing blocks
by the pixel in the upper left corner, it would be possible to replicate each
pixel so that the original would lie roughly at the center of its clones, as
long as some care were taken near the edges of the image. This would prevent
the apparent shift in some features as later passes are displayed.
<a name="INDEX-663.01-new" />
But neither
smoothing nor centered pixel replication is currently supported by the PNG
reference library, <em class="emphasis">libpng</em>, as of version 1.0.3.
</p>
<p>
<a name="INDEX-664" />
<a name="INDEX-665" />
It is worth noting that TIFF can also support a kind of interlacing, although
like everything about TIFF, it is much more arbitrary than either GIF's or
PNG's method. Baseline TIFF includes the concept of <em class="emphasis">strips</em>, each of
which may include one or more rows of image data, though the number of rows
per strip is constant. A list of offsets to each strip is embedded within
the image, so in principle one could make each strip a row and do GIF-style
line interlacing with any ordering one chose. But since TIFF's structure is
fundamentally random access in nature, this approach would only work if one
imposed certain restrictions on the locations of its internal directory, list
of strip offsets, and actual strip data--that is, one would need to define
a particular subformat of TIFF.
</p>
<p><a name="INDEX-666" />
<a name="INDEX-667" />In addition, libtiff supports a TIFF extension called <em class="emphasis">tiles</em>, in
which the image data is organized into rectangular regions instead of
strips. Since the tile size can be arbitrary, one could define it to
be 1 × 1 and then duplicate PNG's Adam7 interlacing scheme
manually--or even extend it to 9, 11, or more passes. However,
since every tile must have a corresponding offset in the TIFF image
directory, doing something like this would at least double or triple
<a name="INDEX-668" />the image size. Also, TIFF's compression methods apply only to
individual strips or tiles, so there would be no real possibility of
compression aside from reusing tiles in more than one location (that
is, by having multiple tile offsets point at the same data). And, as
with the strip approach, this would require restrictions on the
internal layout of the file. Nevertheless, the capability does exist,
at least theoretically.
<a name="INDEX-669" />
<a name="INDEX-670" />
<a name="INDEX-671" />
</p>
</div>
</body></html>