• An In-Depth Look at Parsing TIFF in Java


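    Recently, while reading a TIFF file sent by a customer, the underlying library unexpectedly failed with the error: bandOffsets.length is wrong! The message came from the TIFF read path, so I had little choice but to study the low-level TIFF reading code.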

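    In an earlier article, "Reading TIFF in Java with GeoTools", I briefly introduced the GeoTools code for reading TIFF. Digging deeper, it turns out that the class doing the real work behind the scenes is TIFFImageReader from imageio-ext. Every Java developer knows ImageIO; ImageIO-ext is an extension library for it, and its source is available on GitHub. It is a very capable library and a great help when reading and writing all kinds of raster data in Java.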

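    To follow this article, we first need to understand the structure of a TIFF file. This article explains it well: "TIFF文件结构详解" (TIFF file structure explained), https://blog.csdn.net/oYinHeZhiGuang/article/details/121710467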

    Now let's look at the TIFF reading code in imageio-ext. The main class is TIFFImageReader, so let's see how a Java program reads a TIFF file with it.

    Constructor:

    public TIFFImageReader(ImageReaderSpi originatingProvider) {
            super(originatingProvider);
     }

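    This class is instantiated through an ImageReaderSpi. The SPI (service provider interface) design pattern is used by many open-source Java projects; here the TIFFImageReaderSpi class does the job.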
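
    To make the SPI idea concrete, here is a minimal sketch that uses only the standard javax.imageio API. It asks ImageIO for whichever reader is registered for a format name, takes its provider, and lets the provider build a fresh reader. Which plugin answers for "tiff" depends on the classpath and the Java version (I am assuming either imageio-ext is installed or a JDK with a built-in TIFF plugin is used); the class name SpiLookupSketch is made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.spi.ImageReaderSpi;

// Hypothetical helper class, for illustration only.
class SpiLookupSketch {

    /** Returns the provider of the first reader registered for the format, or null if there is none. */
    static ImageReaderSpi findProvider(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        if (!readers.hasNext()) {
            return null;
        }
        ImageReader probe = readers.next();
        try {
            return probe.getOriginatingProvider();
        } finally {
            probe.dispose();
        }
    }

    /** Lets the provider create a new reader; the provider passes itself to the reader's constructor. */
    static ImageReader newReader(ImageReaderSpi spi) {
        try {
            return spi.createReaderInstance();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ImageReaderSpi spi = findProvider("tiff");
        if (spi == null) {
            System.out.println("No TIFF reader plugin is registered");
            return;
        }
        ImageReader reader = newReader(spi);
        System.out.println(spi.getClass().getName() + " created " + reader.getClass().getName());
        reader.dispose();
    }
}
```

    The point of going through the provider instead of calling the constructor directly is that the calling code does not need to know the concrete reader class.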

    Next, the input to read and a few other options are set through the following method of the class:

    public void setInput(Object input,
                             boolean seekForwardOnly,
                             boolean ignoreMetadata)

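    In this method, input is the source to read (typically an ImageInputStream opened on the TIFF file). Setting seekForwardOnly to true means that images and metadata may only be read from this input source in ascending order. Setting ignoreMetadata to true means that metadata may be ignored while reading.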
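
    As a rough illustration of setInput, the sketch below encodes a tiny image in memory and reads its size back through the standard ImageIO API, so it needs no file on disk. The format name is a parameter; main uses "tiff", which assumes a TIFF plugin (imageio-ext or a JDK that ships one) is available. SetInputSketch, encode and firstImageSize are names invented for this example.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper class, for illustration only.
class SetInputSketch {
    static {
        ImageIO.setUseCache(false); // keep the streams in memory for this demo
    }

    /** Encodes a blank RGB image in the given format. */
    static byte[] encode(String formatName, int width, int height) {
        try {
            BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            if (!ImageIO.write(image, formatName, out)) {
                throw new IllegalStateException("No writer for " + formatName);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {width, height} of the first image in the encoded data. */
    static int[] firstImageSize(String formatName, byte[] data) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        if (!readers.hasNext()) {
            throw new IllegalStateException("No reader for " + formatName);
        }
        ImageReader reader = readers.next();
        try (ImageInputStream in = ImageIO.createImageInputStream(new ByteArrayInputStream(data))) {
            // seekForwardOnly = true: image indices may only be requested in ascending order.
            // ignoreMetadata = true: the reader is allowed to skip metadata.
            reader.setInput(in, true, true);
            return new int[] {reader.getWidth(0), reader.getHeight(0)};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            reader.dispose();
        }
    }

    public static void main(String[] args) {
        int[] size = firstImageSize("tiff", encode("tiff", 4, 3));
        System.out.println(size[0] + " x " + size[1]);
    }
}
```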

    Next comes reading the TIFF metadata; see the getImageMetadata(int imageIndex) method:

    public IIOMetadata getImageMetadata(int imageIndex) throws IIOException {
            seekToImage(imageIndex, true);
            TIFFImageMetadata im =
                new TIFFImageMetadata(imageMetadata.getRootIFD().getTagSetList());
            Node root =
                imageMetadata.getAsTree(TIFFImageMetadata.nativeMetadataFormatName);
            im.setFromTree(TIFFImageMetadata.nativeMetadataFormatName, root);
            if (noData != null) {
                im.setNoData(new double[] {noData, noData});
            }
            if (scales != null && offsets != null) {
                im.setScales(scales);
                im.setOffsets(offsets);
            }
            return im;
        }
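
    From the caller's side, the metadata returned by getImageMetadata is usually inspected as a DOM tree. The following sketch does that with the standard API only: it encodes a small image in memory, reads its metadata and lists the node names of the native metadata tree. It assumes the plugin reports a native metadata format name (the standard PNG and TIFF plugins do); MetadataTreeSketch and its methods are hypothetical names.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;
import org.w3c.dom.Node;

// Hypothetical helper class, for illustration only.
class MetadataTreeSketch {
    static {
        ImageIO.setUseCache(false); // keep the streams in memory for this demo
    }

    /** Encodes a 2x2 blank image so that the example needs no input file. */
    static byte[] encode(String formatName) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BufferedImage image = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
            if (!ImageIO.write(image, formatName, out)) {
                throw new IllegalStateException("No writer for " + formatName);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the metadata of image 0 and returns its tree in the plugin's native format. */
    static Node nativeTree(String formatName, byte[] data) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        if (!readers.hasNext()) {
            throw new IllegalStateException("No reader for " + formatName);
        }
        ImageReader reader = readers.next();
        try (ImageInputStream in = ImageIO.createImageInputStream(new ByteArrayInputStream(data))) {
            reader.setInput(in, false, false); // ignoreMetadata = false: we want the metadata
            IIOMetadata metadata = reader.getImageMetadata(0);
            return metadata.getAsTree(metadata.getNativeMetadataFormatName());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            reader.dispose();
        }
    }

    /** Appends one line per node, indented by depth. */
    static void dump(Node node, String indent, StringBuilder out) {
        out.append(indent).append(node.getNodeName()).append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            dump(child, indent + "  ", out);
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        dump(nativeTree("tiff", encode("tiff")), "", out);
        System.out.print(out);
    }
}
```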

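    seekToImage(imageIndex, true) holds the main logic. The first parameter, imageIndex, is the index of the page within a multi-page TIFF. The second parameter, optimized, controls whether cached information about an already parsed page may be reused; when it is false, the metadata is always read again.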

     private void seekToImage(int imageIndex, boolean optimized) throws IIOException {
            checkIndex(imageIndex);
    
            // TODO we should do this initialization just once!!!
            int index = locateImage(imageIndex);
            if (index != imageIndex) {
                throw new IndexOutOfBoundsException("imageIndex out of bounds!");
            }
            
            final Integer i= Integer.valueOf(index);
            //optimized branch
            if(!optimized){
                
                readMetadata();
                initializeFromMetadata();
                return;
            }
            // in case we have cache the info for this page
            if(pagesInfo.containsKey(i)){
                // initialize from cachedinfo only if needed
                // TODO Improve
                if(imageMetadata == null || !initialized) {// this means the curindex has changed
                    final PageInfo info = pagesInfo.get(i);
                    final TIFFImageMetadata metadata = info.imageMetadata.get();
                    if (metadata != null) {
                        initializeFromCachedInfo(info, metadata);
                        return;
                    }
                    pagesInfo.put(i,null);
                        
                }
            }
            
            readMetadata();
            initializeFromMetadata();
        }

    In this method, the first time a TIFF page is loaded, readMetadata() and initializeFromMetadata() parse its metadata and cache it, so that later reads of the same page are cheaper.
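
    The caching idea can be reduced to a few lines. The sketch below is my own simplification, not the imageio-ext implementation: it keeps one entry per page behind a SoftReference (an assumption, suggested by the info.imageMetadata.get() call in the code above) and re-parses a page when the caller opts out of the cache or when the cached entry has been garbage collected. PageCacheSketch, seek and parseCount are hypothetical names.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical, simplified model of a per-page metadata cache.
class PageCacheSketch<M> {
    private final Map<Integer, SoftReference<M>> pages = new HashMap<>();
    private final IntFunction<M> parser;

    /** Number of times the (expensive) parser has actually run. */
    int parseCount = 0;

    PageCacheSketch(IntFunction<M> parser) {
        this.parser = parser;
    }

    /** Returns the metadata of a page, reusing the cached copy only when optimized is true. */
    M seek(int pageIndex, boolean optimized) {
        if (optimized) {
            SoftReference<M> ref = pages.get(pageIndex);
            M cached = (ref == null) ? null : ref.get();
            if (cached != null) {
                return cached; // cache hit: no parsing needed
            }
        }
        // Cache miss, cleared reference, or caller asked for a fresh read.
        M fresh = parser.apply(pageIndex);
        parseCount++;
        pages.put(pageIndex, new SoftReference<>(fresh));
        return fresh;
    }

    public static void main(String[] args) {
        PageCacheSketch<String> cache = new PageCacheSketch<>(i -> "metadata of page " + i);
        cache.seek(0, true);
        cache.seek(0, true);
        System.out.println("parser ran " + cache.parseCount + " time(s)");
    }
}
```

    A soft reference lets the JVM drop cached metadata under memory pressure, which is why the lookup has to handle a cleared entry.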

    The reading process

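    This is best understood alongside the TIFF format itself. Broadly, the reader parses the TIFF header, obtains the IFD (the image file directory), and then parses the contents of each directory in turn. I will not list all of that code here.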
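
    The header step is small enough to sketch directly. Based on the TIFF layout (2 bytes of byte order, a 2-byte magic number, then the offset of the first IFD; BigTIFF uses magic number 43 and an 8-byte offset), a stand-alone parser might look like this. It is an illustration written for this article, not code from imageio-ext, and TiffHeaderSketch is a made-up name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-alone parser for the first bytes of a TIFF file.
class TiffHeaderSketch {
    final ByteOrder order;
    final boolean bigTiff;
    final long firstIfdOffset;

    TiffHeaderSketch(ByteOrder order, boolean bigTiff, long firstIfdOffset) {
        this.order = order;
        this.bigTiff = bigTiff;
        this.firstIfdOffset = firstIfdOffset;
    }

    static TiffHeaderSketch parse(byte[] file) {
        if (file.length < 8) {
            throw new IllegalArgumentException("Too short to be a TIFF file");
        }
        // "II" = little-endian (Intel), "MM" = big-endian (Motorola).
        ByteOrder order;
        if (file[0] == 'I' && file[1] == 'I') {
            order = ByteOrder.LITTLE_ENDIAN;
        } else if (file[0] == 'M' && file[1] == 'M') {
            order = ByteOrder.BIG_ENDIAN;
        } else {
            throw new IllegalArgumentException("Bad byte-order mark");
        }
        ByteBuffer buf = ByteBuffer.wrap(file).order(order);
        int magic = buf.getShort(2) & 0xFFFF;
        if (magic == 42) {
            // Classic TIFF: unsigned 32-bit offset of the first IFD at byte 4.
            return new TiffHeaderSketch(order, false, buf.getInt(4) & 0xFFFFFFFFL);
        }
        if (magic == 43) {
            // BigTIFF: offset size at byte 4, zero at byte 6, 64-bit IFD offset at byte 8.
            if (file.length < 16) {
                throw new IllegalArgumentException("Too short to be a BigTIFF file");
            }
            return new TiffHeaderSketch(order, true, buf.getLong(8));
        }
        throw new IllegalArgumentException("Not a TIFF file, magic = " + magic);
    }

    public static void main(String[] args) {
        TiffHeaderSketch h = parse(new byte[] {'I', 'I', 42, 0, 8, 0, 0, 0});
        System.out.println(h.order + ", bigTiff=" + h.bigTiff + ", first IFD at " + h.firstIfdOffset);
    }
}
```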

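    The key point is that parsing the directory is how the TIFF metadata is obtained, which mostly means parsing the information of each tag. The parsing code lives in the initialize(ImageInputStream stream, boolean ignoreUnknownFields, final boolean isBTIFF) method of the TIFFIFD class: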

    public void initialize(ImageInputStream stream,
                boolean ignoreUnknownFields, final boolean isBTIFF) throws IOException {
            removeTIFFFields();
    
            List tagSetList = getTagSetList();
    
           
            final long numEntries;
            if(isBTIFF)
                numEntries= stream.readLong();
            else
                numEntries= stream.readUnsignedShort();
            
            for (int i = 0; i < numEntries; i++) {
                // Read tag number, value type, and value count.
                int tag = stream.readUnsignedShort();
                int type = stream.readUnsignedShort();
                int count;
                if(isBTIFF)
                {
                    long count_=stream.readLong();
                    count = (int)count_;
                    if(count!=count_)
                        throw new IllegalArgumentException("unable to use long number of values");
                }
                else            
                    count = (int)stream.readUnsignedInt();
    
                // Get the associated TIFFTag.
                TIFFTag tiffTag = getTag(tag, tagSetList);
    
                // Ignore unknown fields.
                if(ignoreUnknownFields && tiffTag == null) {
                    // Skip the value/offset so as to leave the stream
                    // position at the start of the next IFD entry.
    
                    if(isBTIFF)
                        stream.skipBytes(8);
                    else
                        stream.skipBytes(4);
    
                    // XXX Warning message ...
    
                    // Continue with the next IFD entry.
                    continue;
                }
           
                long nextTagOffset;
                
                if(isBTIFF){
                    nextTagOffset = stream.getStreamPosition() + 8;
                    int sizeOfType = TIFFTag.getSizeOfType(type);
                    if (count*sizeOfType > 8) {
                        long value = stream.readLong();
                        stream.seek(value);
                     }
                }
                else{                
                    nextTagOffset = stream.getStreamPosition() + 4;
                    int sizeOfType = TIFFTag.getSizeOfType(type);
                     if (count*sizeOfType > 4) {
                        long value = stream.readUnsignedInt();
                        stream.seek(value);
                     }
                }
                
                if (tag == BaselineTIFFTagSet.TAG_STRIP_BYTE_COUNTS ||
                    tag == BaselineTIFFTagSet.TAG_TILE_BYTE_COUNTS ||
                    tag == BaselineTIFFTagSet.TAG_JPEG_INTERCHANGE_FORMAT_LENGTH) {
                    this.stripOrTileByteCountsPosition =
                        stream.getStreamPosition();
                    if (LAZY_LOADING) {
                        type = type == TIFFTag.TIFF_LONG ? TIFFTag.TIFF_LAZY_LONG : TIFFTag.TIFF_LAZY_LONG8;
                    }
                } else if (tag == BaselineTIFFTagSet.TAG_STRIP_OFFSETS ||
                           tag == BaselineTIFFTagSet.TAG_TILE_OFFSETS ||
                           tag == BaselineTIFFTagSet.TAG_JPEG_INTERCHANGE_FORMAT) {
                    this.stripOrTileOffsetsPosition =
                        stream.getStreamPosition();
                    if (LAZY_LOADING) {
                        type = type == TIFFTag.TIFF_LONG ? TIFFTag.TIFF_LAZY_LONG : TIFFTag.TIFF_LAZY_LONG8;
                    }
                }
    
                Object obj = null;
    
                try {
                    switch (type) {
                    case TIFFTag.TIFF_BYTE:
                    case TIFFTag.TIFF_SBYTE:
                    case TIFFTag.TIFF_UNDEFINED:
                    case TIFFTag.TIFF_ASCII:
                        byte[] bvalues = new byte[count];
                        stream.readFully(bvalues, 0, count);
                    
                        if (type == TIFFTag.TIFF_ASCII) {
                            // Can be multiple strings
                            final List<String> v = new ArrayList<String>();
                            boolean inString = false;
                            int prevIndex = 0;
                            for (int index = 0; index <= count; index++) {
                                if (index < count && bvalues[index] != 0) {
                                    if (!inString) {
                                    // start of string
                                        prevIndex = index;
                                        inString = true;
                                    }
                                } else { // null or special case at end of string
                                    if (inString) {
                                    // end of string
                                        final String s = new String(bvalues, prevIndex,index - prevIndex);
                                        v.add(s);
                                        inString = false;
                                    }
                                }
                            }
    
                            count = v.size();
                            String[] strings;
                            if(count != 0) {
                                strings = new String[count];
                                for (int c = 0 ; c < count; c++) {
                                    strings[c] = v.get(c);
                                }
                            } else {
                                // This case has been observed when the value of
                                // 'count' recorded in the field is non-zero but
                                // the value portion contains all nulls.
                                count = 1;
                                strings = new String[] {""};
                            }
                        
                            obj = strings;
                        } else {
                            obj = bvalues;
                        }
                        break;
                    
                    case TIFFTag.TIFF_SHORT:
                        char[] cvalues = new char[count];
                        for (int j = 0; j < count; j++) {
                            cvalues[j] = (char)(stream.readUnsignedShort());
                        }
                        obj = cvalues;
                        break;
                    
                    case TIFFTag.TIFF_LONG:
                    case TIFFTag.TIFF_IFD_POINTER:
                        long[] lvalues = new long[count];
                        for (int j = 0; j < count; j++) {
                            lvalues[j] = stream.readUnsignedInt();
                        }
                        obj = lvalues;
                        break;
                    
                    case TIFFTag.TIFF_RATIONAL:
                        long[][] llvalues = new long[count][2];
                        for (int j = 0; j < count; j++) {
                            llvalues[j][0] = stream.readUnsignedInt();
                            llvalues[j][1] = stream.readUnsignedInt();
                        }
                        obj = llvalues;
                        break;
                    
                    case TIFFTag.TIFF_SSHORT:
                        short[] svalues = new short[count];
                        for (int j = 0; j < count; j++) {
                            svalues[j] = stream.readShort();
                        }
                        obj = svalues;
                        break;
                    
                    case TIFFTag.TIFF_SLONG:
                        int[] ivalues = new int[count];
                        for (int j = 0; j < count; j++) {
                            ivalues[j] = stream.readInt();
                        }
                        obj = ivalues;
                        break;
                    
                    case TIFFTag.TIFF_SRATIONAL:
                        int[][] iivalues = new int[count][2];
                        for (int j = 0; j < count; j++) {
                            iivalues[j][0] = stream.readInt();
                            iivalues[j][1] = stream.readInt();
                        }
                        obj = iivalues;
                        break;
                    
                    case TIFFTag.TIFF_FLOAT:
                        float[] fvalues = new float[count];
                        for (int j = 0; j < count; j++) {
                            fvalues[j] = stream.readFloat();
                        }
                        obj = fvalues;
                        break;
                    
                    case TIFFTag.TIFF_DOUBLE:
                        double[] dvalues = new double[count];
                        for (int j = 0; j < count; j++) {
                            dvalues[j] = stream.readDouble();
                        }
                        obj = dvalues;
                        break;
                               
                    case TIFFTag.TIFF_LONG8:
                    case TIFFTag.TIFF_SLONG8:    
                    case TIFFTag.TIFF_IFD8:
                        long[] lBvalues = new long[count];
                        for (int j = 0; j < count; j++) {
                            lBvalues[j] = stream.readLong();
                        }
                        obj = lBvalues;
                        break;
                    
                    case TIFFTag.TIFF_LAZY_LONG8:   
                    case TIFFTag.TIFF_LAZY_LONG:   
                        obj = new TIFFLazyData(stream, type, count);
                        break;
                    default:
                        // XXX Warning
                        break;
                    }
                } catch(EOFException eofe) {
                    // The TIFF 6.0 fields have tag numbers less than or equal
                    // to 532 (ReferenceBlackWhite) or equal to 33432 (Copyright).
                    // If there is an error reading a baseline tag, then re-throw
                    // the exception and fail; otherwise continue with the next
                    // field.
                    if(BaselineTIFFTagSet.getInstance().getTag(tag) == null) {
                        throw eofe;
                    }
                }
                
                if (tiffTag == null) {
                    // XXX Warning: unknown tag
                } else if (!tiffTag.isDataTypeOK(type)) {
                    // XXX Warning: bad data type
                } else if (tiffTag.isIFDPointer() && obj != null) {
                    stream.mark();
                    stream.seek(((long[])obj)[0]);
    
                    List tagSets = new ArrayList(1);
                    tagSets.add(tiffTag.getTagSet());
                    TIFFIFD subIFD = new TIFFIFD(tagSets);
    
                    // XXX Use same ignore policy for sub-IFD fields?
                    subIFD.initialize(stream, ignoreUnknownFields);
                    obj = subIFD;
                    stream.reset();
                }
    
                if (tiffTag == null) {
                    tiffTag = new TIFFTag(null, tag, 1 << type, null);
                }
    
                // Add the field if its contents have been initialized which
                // will not be the case if an EOF was ignored above.
                if(obj != null) {
                    TIFFField f = new TIFFField(tiffTag, type, count, obj);
                    addTIFFField(f);
                }
    
                stream.seek(nextTagOffset);
            }
    
            this.lastPosition = stream.getStreamPosition();
        }
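
    The core of the loop above is easier to see in isolation. The following sketch handles classic (non-BigTIFF) directories only: each entry is 12 bytes (tag, type, count, then a 4-byte field), and that last field holds the value itself when count times the type size fits in 4 bytes, otherwise an offset to the value. This mirrors the count*sizeOfType > 4 test in initialize, but it is a simplified illustration of my own; IfdEntrySketch and readIfd are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified reader for the entries of one classic TIFF IFD.
class IfdEntrySketch {
    // Bytes per value for TIFF 6.0 field types 1..12 (index 0 is unused).
    private static final int[] TYPE_SIZE = {0, 1, 1, 2, 4, 8, 1, 1, 2, 4, 8, 4, 8};

    final int tag;
    final int type;
    final long count;
    final boolean inline;      // true when the value fits in the 4-byte field
    final long valueOrOffset;  // raw 4-byte field: file offset when inline is false

    IfdEntrySketch(int tag, int type, long count, boolean inline, long valueOrOffset) {
        this.tag = tag;
        this.type = type;
        this.count = count;
        this.inline = inline;
        this.valueOrOffset = valueOrOffset;
    }

    /** Reads all entries of the IFD that starts at ifdOffset; buf must already use the file's byte order. */
    static List<IfdEntrySketch> readIfd(ByteBuffer buf, int ifdOffset) {
        List<IfdEntrySketch> entries = new ArrayList<>();
        int numEntries = buf.getShort(ifdOffset) & 0xFFFF;
        for (int i = 0; i < numEntries; i++) {
            int pos = ifdOffset + 2 + 12 * i;
            int tag = buf.getShort(pos) & 0xFFFF;
            int type = buf.getShort(pos + 2) & 0xFFFF;
            long count = buf.getInt(pos + 4) & 0xFFFFFFFFL;
            // Unknown types get size 0 here; a real reader would warn or skip them.
            int size = (type >= 1 && type < TYPE_SIZE.length) ? TYPE_SIZE[type] : 0;
            boolean inline = count * size <= 4;
            long raw = buf.getInt(pos + 8) & 0xFFFFFFFFL;
            entries.add(new IfdEntrySketch(tag, type, count, inline, raw));
        }
        return entries;
    }

    public static void main(String[] args) {
        // One entry: tag 256 (ImageWidth), type 3 (SHORT), count 1, value 100.
        ByteBuffer buf = ByteBuffer.allocate(18).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 1);
        buf.putShort((short) 256).putShort((short) 3).putInt(1).putInt(100);
        buf.putInt(0); // offset of the next IFD, 0 = none
        for (IfdEntrySketch e : readIfd(buf, 0)) {
            System.out.println("tag=" + e.tag + " type=" + e.type + " count=" + e.count + " inline=" + e.inline);
        }
    }
}
```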

    The commonly used TIFF tag set classes include BaselineTIFFTagSet, FaxTIFFTagSet, GeoTIFFTagSet, EXIFParentTIFFTagSet and PrivateTIFFTagSet.

    Of these, GeoTIFFTagSet covers the extra information stored by GeoTIFF; GeoTIFF is the TIFF format's way of supporting GIS data. PrivateTIFFTagSet adds support for GDAL, namely the NODATA and METADATA information.

    As for the bandOffsets.length is wrong! error mentioned at the start of this article, its cause lies in the following part of the getImageTypes(int imageIndex) method:

    ImageTypeSpecifier itsRaw = 
                TIFFDecompressor.getRawImageTypeSpecifier
                    (photometricInterpretation,
                     compression,
                     samplesPerPixel,
                     bitsPerSample,
                     sampleFormat,
                     extraSamples,
                     colorMap);

    In the end, the problem shows up in the constructor of the Interleaved class nested in ImageTypeSpecifier, Interleaved(ColorSpace colorSpace, int[] bandOffsets, int dataType, boolean hasAlpha, boolean isAlphaPremultiplied):

    public Interleaved(ColorSpace colorSpace,
                               int[] bandOffsets,
                               int dataType,
                               boolean hasAlpha,
                               boolean isAlphaPremultiplied) {
                if (colorSpace == null) {
                    throw new IllegalArgumentException("colorSpace == null!");
                }
                if (bandOffsets == null) {
                    throw new IllegalArgumentException("bandOffsets == null!");
                }
                int numBands = colorSpace.getNumComponents() +
                    (hasAlpha ? 1 : 0);
                if (bandOffsets.length != numBands) {
                    throw new IllegalArgumentException
                        ("bandOffsets.length is wrong!");
                }

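    So this error is thrown whenever the number of band offsets differs from the number of bands, that is, the color space's component count plus one when there is an alpha channel.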
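
    The check is easy to reproduce without any TIFF file. The sketch below calls the JDK factory ImageTypeSpecifier.createInterleaved, which I am assuming runs the same validation as the constructor quoted above, with an sRGB color space (3 color components) and different numbers of band offsets. BandOffsetsSketch and tryInterleaved are names invented for this example.

```java
import java.awt.color.ColorSpace;
import java.awt.image.DataBuffer;
import javax.imageio.ImageTypeSpecifier;

// Hypothetical helper class, for illustration only.
class BandOffsetsSketch {

    /** Returns null if the layout is accepted, otherwise the message of the exception thrown. */
    static String tryInterleaved(int bandCount, boolean hasAlpha) {
        ColorSpace rgb = ColorSpace.getInstance(ColorSpace.CS_sRGB); // 3 color components
        int[] bandOffsets = new int[bandCount];
        for (int i = 0; i < bandCount; i++) {
            bandOffsets[i] = i; // band i is stored at sample position i of each pixel
        }
        try {
            ImageTypeSpecifier.createInterleaved(rgb, bandOffsets, DataBuffer.TYPE_BYTE, hasAlpha, false);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("3 bands, no alpha: " + tryInterleaved(3, false));
        System.out.println("4 bands, alpha:    " + tryInterleaved(4, true));
        System.out.println("4 bands, no alpha: " + tryInterleaved(4, false));
    }
}
```

    In other words, a file that declares four samples per pixel is only accepted with an RGB color space if one of those samples is treated as alpha.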

    Summary

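    Working through this problem gave me a reasonably complete picture of how Java reads TIFF files with ImageIO-ext, and the steps map closely onto the TIFF data structure.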

  • Original article: https://www.cnblogs.com/share-gis/p/16630396.html