I am moving some image processing functionality from .NET to Python under the constraint that the output images must be compressed in exactly the same way as they were in .NET. However, when I compare the .jpg output files with a tool like text-compare and choose "Ignore nothing", there are significant differences in how the files were compressed.
For example:
Python
import PIL.Image

bmp = PIL.Image.open('marbles.bmp')
bmp.save(
    'output_python.jpg',
    format='jpeg',
    dpi=(300, 300),
    subsampling=2,  # 4:2:0 chroma subsampling
    quality=75
)
.NET
ImageCodecInfo jgpEncoder = ImageCodecInfo.GetImageDecoders().First(codec => codec.FormatID == ImageFormat.Jpeg.Guid);
EncoderParameters myEncoderParameters = new EncoderParameters(1);
myEncoderParameters.Param[0] = new EncoderParameter(Encoder.Quality, 75L);
Bitmap bmp = new Bitmap(directory + "marbles.bmp");
bmp.Save(directory + "output_net.jpg", jgpEncoder, myEncoderParameters);
exiftool output_python.jpg -a -G1 -w txt
[ExifTool] ExifTool Version Number : 12.31
[System] File Name : output_python.jpg
[System] Directory : .
[System] File Size : 148 KiB
[System] File Modification Date/Time : 2021:09:28 09:19:20-06:00
[System] File Access Date/Time : 2021:09:28 09:19:21-06:00
[System] File Creation Date/Time : 2021:09:27 21:33:35-06:00
[System] File Permissions : -rw-rw-rw-
[File] File Type : JPEG
[File] File Type Extension : jpg
[File] MIME Type : image/jpeg
[File] Image Width : 1419
[File] Image Height : 1001
[File] Encoding Process : Baseline DCT, Huffman coding
[File] Bits Per Sample : 8
[File] Color Components : 3
[File] Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)
[JFIF] JFIF Version : 1.01
[JFIF] Resolution Unit : inches
[JFIF] X Resolution : 300
[JFIF] Y Resolution : 300
[Composite] Image Size : 1419x1001
[Composite] Megapixels : 1.4
exiftool output_net.jpg -a -G1 -w txt
[ExifTool] ExifTool Version Number : 12.31
[System] File Name : output_net.jpg
[System] Directory : .
[System] File Size : 147 KiB
[System] File Modification Date/Time : 2021:09:28 09:18:05-06:00
[System] File Access Date/Time : 2021:09:28 09:18:52-06:00
[System] File Creation Date/Time : 2021:09:27 21:32:19-06:00
[System] File Permissions : -rw-rw-rw-
[File] File Type : JPEG
[File] File Type Extension : jpg
[File] MIME Type : image/jpeg
[File] Image Width : 1419
[File] Image Height : 1001
[File] Encoding Process : Baseline DCT, Huffman coding
[File] Bits Per Sample : 8
[File] Color Components : 3
[File] Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)
[JFIF] JFIF Version : 1.01
[JFIF] Resolution Unit : inches
[JFIF] X Resolution : 300
[JFIF] Y Resolution : 300
[Composite] Image Size : 1419x1001
[Composite] Megapixels : 1.4
Difference on text-compare
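Since text-compare only shows that the bytes differ, it may help to separate a byte-level difference from a pixel-level one. Below is a minimal sketch, assuming Pillow is installed. `compare_jpegs` is a hypothetical helper name, not part of PIL. It decodes both files and reports whether the bytes match, whether the decoded pixels match, and the largest per-channel difference.

```python
import io

from PIL import Image, ImageChops


def compare_jpegs(data_a: bytes, data_b: bytes) -> dict:
    """Compare two JPEG byte strings at the file level and at the pixel level."""
    img_a = Image.open(io.BytesIO(data_a)).convert("RGB")
    img_b = Image.open(io.BytesIO(data_b)).convert("RGB")
    if img_a.size != img_b.size:
        raise ValueError("images have different dimensions")

    # Absolute per-channel difference between the two decoded images.
    diff = ImageChops.difference(img_a, img_b)
    # For an RGB image, getextrema() returns one (min, max) pair per band.
    max_delta = max(band_max for _, band_max in diff.getextrema())

    return {
        "identical_bytes": data_a == data_b,
        # getbbox() is None when every pixel of the difference image is zero.
        "identical_pixels": diff.getbbox() is None,
        "max_channel_delta": max_delta,
    }
```

For the files above it could be called as `compare_jpegs(open('output_python.jpg', 'rb').read(), open('output_net.jpg', 'rb').read())`. Note that both files are decoded by the same decoder here, so any pixel difference reflects the encoders, not the decoding.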
Questions
- Is it reasonable to assume that these two implementations of JPEG compression could yield identical output files?
- If so, is either PIL or System.Drawing.Image doing any extra steps, like anti-aliasing, that make the results different?
- Or are there additional parameters to PIL's .save() that would make it behave more like the JPEG encoder in C#?
Thanks
Update
Based on Jeremy's recommendation, I used JPEGsnoop to compare more details between the files and found that the Luminance and Chrominance tables were different. I modified the code:
bmp = PIL.Image.open('marbles.bmp')
output_net = PIL.Image.open('output_net.jpg')
bmp.save(
    'output_python.jpg',
    format='jpeg',
    dpi=(300, 300),
    subsampling=2,
    qtables=output_net.quantization,
    # quality=75
)
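The table comparison can also be done from Python instead of JPEGsnoop. This is a minimal sketch that relies only on the `quantization` attribute Pillow exposes on opened JPEG images. The helper names `quant_tables` and `same_quant_tables` are made up for illustration.

```python
import io

from PIL import Image


def quant_tables(jpeg_bytes: bytes) -> dict:
    """Return {table_id: list of 64 ints} for the quantization tables of a JPEG."""
    with Image.open(io.BytesIO(jpeg_bytes)) as img:
        # img.quantization maps a table id (0 = luminance, 1 = chrominance
        # in typical files) to a sequence of 64 values.
        return {tid: list(tbl) for tid, tbl in img.quantization.items()}


def same_quant_tables(jpeg_a: bytes, jpeg_b: bytes) -> bool:
    """True if both JPEGs carry the same quantization tables under the same ids."""
    return quant_tables(jpeg_a) == quant_tables(jpeg_b)
```

Calling `same_quant_tables` on the bytes of output_net.jpg and output_python.jpg should return True after the change above, if the tables were copied over as intended.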
Now the tables are the same, but the difference between the files is unchanged. The only differences JPEGsnoop shows now are in the "Compression stats" and "Huffman code histogram stats".
output_net.jpg
*** Decoding SCAN Data ***
OFFSET: 0x0000026F
Scan Decode Mode: Full IDCT (AC + DC)
Scan Data encountered marker 0xFFD9 @ 0x00024BE7.0
Compression stats:
Compression Ratio: 28.43:1
Bits per pixel: 0.84:1
Huffman code histogram stats:
Huffman Table: (Dest ID: 0, Class: DC)
# codes of length 01 bits: 0 ( 0%)
# codes of length 02 bits: 1664 ( 7%)
# codes of length 03 bits: 18238 ( 81%)
# codes of length 04 bits: 1807 ( 8%)
# codes of length 05 bits: 715 ( 3%)
# codes of length 06 bits: 4 ( 0%)
# codes of length 07 bits: 0 ( 0%)
...
output_python.jpg
*** Decoding SCAN Data ***
OFFSET: 0x0000026F
Scan Decode Mode: Full IDCT (AC + DC)
Scan Data encountered marker 0xFFD9 @ 0x00025158.0
Compression stats:
Compression Ratio: 28.17:1
Bits per pixel: 0.85:1
Huffman code histogram stats:
Huffman Table: (Dest ID: 0, Class: DC)
# codes of length 01 bits: 0 ( 0%)
# codes of length 02 bits: 1659 ( 7%)
# codes of length 03 bits: 18247 ( 81%)
# codes of length 04 bits: 1807 ( 8%)
# codes of length 05 bits: 711 ( 3%)
# codes of length 06 bits: 4 ( 0%)
# codes of length 07 bits: 0 ( 0%)
...
I am now looking for a way to sync these values through PIL.
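One thing that can be checked first is whether both files carry the same Huffman tables at all. Both dumps start their scan data at the same offset (0x0000026F), which suggests the headers have the same length, but not necessarily the same content. The sketch below walks the JPEG marker segments up to the start of the scan and collects the raw DHT (0xFFC4) payloads so they can be compared. It assumes a baseline, single-scan file like the two above, and `huffman_segments` is a hypothetical helper, not a PIL function.

```python
import struct


def huffman_segments(jpeg_bytes: bytes) -> list:
    """Return the raw payload of every DHT (0xFFC4) segment before the scan data."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")

    segments = []
    pos = 2  # skip the SOI marker
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if marker == 0xC4:  # DHT: Define Huffman Table(s)
            segments.append(jpeg_bytes[pos + 4:pos + 2 + length])
        if marker == 0xDA:  # SOS: entropy-coded data follows, stop here
            break
        pos += 2 + length
    return segments
```

If `huffman_segments` returns the same list for both files, the differing histogram counts would have to come from the quantized coefficients themselves (for example from differences in color conversion, chroma downsampling, or DCT rounding between the two encoders) rather than from the Huffman tables. As far as I know, those steps are not something `save()` exposes parameters for.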
from JPEG compression differences in C# and Python