• XMLDecoder Parsing Flow Analysis

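    I originally set out to study CVE-2017-10271, but most of the articles I found through Baidu stop at XMLDecoder.readObject(). I had never worked with XMLDecoder before, so I had to go back and fill in the basics first.

    First, write a demo for debugging.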

    XMLDecoderDemo01

    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.beans.XMLDecoder;
    
    public class XMLDecoderDemo01 {
    
        public static void XMLDecode_Deserialize(String path) throws Exception {
            File file = new File(path);
            FileInputStream fis = new FileInputStream(file);
            BufferedInputStream bis = new BufferedInputStream(fis);
            XMLDecoder xd = new XMLDecoder(bis);
            xd.readObject(); // set a breakpoint here
            xd.close();
        }
    
    
        public static void main(String[] args){
            //XMLDecode Deserialize Test
            String path = System.getProperty("user.dir") + "\\src\\pocDemo01.xml";
    
            try {
                
                XMLDecode_Deserialize(path);
    
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    

    Next comes the XML document.

    pocDemo01.xml

    
    <java version="1.8.0_131" class="java.beans.XMLDecoder">
        <object class="java.lang.ProcessBuilder">
            <array class="java.lang.String" length="1">
                <void index="0">
                    <string>calc</string>
                </void>
            </array>
            <void method="start" />
        </object>
    </java>
    

    This is a very simple XML document that launches the calculator (calc) through ProcessBuilder.
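    To make the mapping between the XML and the resulting Java calls concrete, here is a harmless sketch. The class name XmlToJavaMapping and the in-memory document are assumptions made for illustration; the document builds a java.util.ArrayList instead of a ProcessBuilder, so nothing is launched. The idea it shows: an <object class="..."> element creates an instance, and each nested <void method="..."> element calls a method on that instance.

```java
import java.beans.XMLDecoder;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original article.
class XmlToJavaMapping {

    // A harmless document with the same shape as pocDemo01.xml:
    // <object> creates the instance, <void method="..."> calls a method on it.
    static final String XML =
            "<java>"
          + "  <object class=\"java.util.ArrayList\">"
          + "    <void method=\"add\"><string>a</string></void>"
          + "    <void method=\"add\"><string>b</string></void>"
          + "  </object>"
          + "</java>";

    // Decodes the first object described by the given XML text.
    static Object decode(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try (XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(bytes))) {
            return decoder.readObject();
        }
    }

    public static void main(String[] args) {
        // Roughly the same as: List l = new ArrayList(); l.add("a"); l.add("b");
        System.out.println(decode(XML));
    }
}
```

    Swapping the class and method names is all it takes to turn this into the ProcessBuilder document above, which is why decoding untrusted XML with XMLDecoder is dangerous.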

    Process analysis

    In debug mode, the first method of interest is DocumentHandler.parse(), which can be located through the call stack.

    //DocumentHandler.java
    public void parse(final InputSource var1) {
        if (this.acc == null && null != System.getSecurityManager()) {
            throw new SecurityException("AccessControlContext is not set");
        } else {
            AccessControlContext var2 = AccessController.getContext();
            SharedSecrets.getJavaSecurityAccess().doIntersectionPrivilege(new PrivilegedAction<Void>() {
                public Void run() {
                    try {
                        // Look at this line
                        SAXParserFactory.newInstance().newSAXParser().parse(var1, DocumentHandler.this);
                    } catch (ParserConfigurationException var3) {
                        DocumentHandler.this.handleException(var3);
                    } catch (SAXException var4) {
                        Object var2 = var4.getException();
                        if (var2 == null) {
                            var2 = var4;
                        }
    
                        DocumentHandler.this.handleException((Exception)var2);
                    } catch (IOException var5) {
                        DocumentHandler.this.handleException(var5);
                    }
    
                    return null;
                }
            }, var2, this.acc);
        }
    }
    

    The action argument of doIntersectionPrivilege() is an anonymous class. Its run() method instantiates a SAXParser and calls its parse() method, which is our next entry point.

    //SAXParserImpl.java
    public void parse(InputSource is, DefaultHandler dh)
        throws SAXException, IOException {
        if (is == null) {
            throw new IllegalArgumentException();
        }
        if (dh != null) {
            xmlReader.setContentHandler(dh);
            xmlReader.setEntityResolver(dh);
            xmlReader.setErrorHandler(dh);
            xmlReader.setDTDHandler(dh);
            xmlReader.setDocumentHandler(null);
        }
        xmlReader.parse(is);
    }
    

    Here xmlReader is a JAXPSAXParser, an inner class of SAXParserImpl.

    //SAXParserImpl$JAXPSAXParser
    public void parse(InputSource inputSource)
        throws SAXException, IOException {
        if (fSAXParser != null && fSAXParser.fSchemaValidator != null) {
            if (fSAXParser.fSchemaValidationManager != null) {
                fSAXParser.fSchemaValidationManager.reset();
                fSAXParser.fUnparsedEntityHandler.reset();
            }
            resetSchemaValidator();
        }
        super.parse(inputSource);
    }
    

    After that, execution passes through a series of parse() methods whose main job is to prepare for the later parsing. The actual parsing only starts in XML11Configuration.parse().

    //XML11Configuration.java
    public boolean parse(boolean complete) throws XNIException, IOException {

        /*...*/

        try {
            return fCurrentScanner.scanDocument(complete);
        } catch (XNIException ex) {
            /*...*/
            throw ex;
        }
        /*...*/
    }
    

    As the name suggests, scanDocument() is the method that scans our pocDemo01.xml document.

    //XMLDocumentFragmentScannerImpl.java
    /**
     * Scans a document.
     *
     * @param complete True if the scanner should scan the document
     *                 completely, pushing all events to the registered
     *                 document handler. A value of false indicates that
     *                 that the scanner should only scan the next portion
     *                 of the document and return. A scanner instance is
     *                 permitted to completely scan a document if it does
     *                 not support this "pull" scanning model.
     *
     * @return True if there is more to scan, false otherwise.
     */
    public boolean scanDocument(boolean complete)
    throws IOException, XNIException {
    
        // keep dispatching "events"
        fEntityManager.setEntityHandler(this);
        //System.out.println(" get Document Handler in NSDocumentHandler " + fDocumentHandler );
    
        int event = next();
        do {
            switch (event) {
                case XMLStreamConstants.START_DOCUMENT :
                    break;
                case XMLStreamConstants.ATTRIBUTE :
                    break;
                case XMLStreamConstants.END_ELEMENT :
                    //do not give callback here.
                    //this callback is given in scanEndElement function.
                    //fDocumentHandler.endElement(getElementQName(),null);
                    break;
                /* ... cases for the other event types omitted ... */
                default :
                    throw new InternalError("processing event: " + event);
            }
            event = next();
        } while (event!=XMLStreamConstants.END_DOCUMENT && complete);
    
        if(event == XMLStreamConstants.END_DOCUMENT) {
            fDocumentHandler.endDocument(null);
            return false;
        }
    
        return true;
    
    } // scanDocument(boolean):boolean
    

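    Some older articles analyze the XMLStreamConstants.END_ELEMENT branch here, but as you can see its callback has been commented out. According to the comment, the handling was moved into the scanEndElement function. Searching for scanEndElement in IDEA and using the Find Usages feature shows that it is called from the next() method, so set a breakpoint at scanEndElement().

    Take a look at the comment on next():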

    /**
     * Drives the parser to the next state/event on the input. Parser is guaranteed
     * to stop at the next state/event. Internally XML document
     * is divided into several states. Each state represents a sections of XML
     * document. When this functions returns normally, it has read the section
     * of XML document and returns the state corresponding to section of
     * document which has been read. For optimizations, a particular driver
     * can read ahead of the section of document (state returned) just read and
     * can maintain a different internal state.
     *
     * State returned corresponds to Stax states.
     *
     * @return state representing the section of document just read.
     *
     * @throws IOException  Thrown on i/o error.
     * @throws XNIException Thrown on parse error.
     */
    
    public int next() throws IOException, XNIException{}
    

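    From this we learn that next() is the function that actually parses the XML document. Its internals are quite complex; for the details, see the article "XMLDecoder Parsing Flow (Detailed Explanation)".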

    Put simply, every result that next() parses is processed by a corresponding handler.
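    The "call next(), then switch on the returned event" pattern can be tried out with the public StAX API, whose event constants are the same XMLStreamConstants values that scanDocument() switches on. This is only an analogy under stated assumptions: the class PullLoopSketch is made up for illustration, and it drives javax.xml.stream.XMLStreamReader rather than the internal scanner classes discussed in this article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: a pull loop in the style of scanDocument().
class PullLoopSketch {

    // Pulls events one at a time and records how each one was "handled".
    static List<String> scan(String xml) {
        List<String> handled = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next(); // advance to the next state/event
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        handled.add("START_ELEMENT " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        handled.add("CHARACTERS " + reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        handled.add("END_ELEMENT " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.END_DOCUMENT:
                        handled.add("END_DOCUMENT");
                        break;
                    default:
                        break; // other event types are ignored in this sketch
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("scan failed", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        scan("<string>calc</string>").forEach(System.out::println);
    }
}
```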

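    The elements of the XML document are evaluated from the inside out, because an element's handler only finishes its work when the end of that element is reached. In this example, pocDemo01.xml, <string>calc</string> is handled first, then the enclosing <void index="0">, and so on. Here we go straight to the part that parses <void method="start" />.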

    //XMLDocumentFragmentScannerImpl$FragmentContentDriver
    public int next() throws IOException, XNIException{
        switch(fScannerState){
    
            case XMLEvent.START_DOCUMENT :
                return XMLEvent.START_DOCUMENT;
    
            case SCANNER_STATE_START_ELEMENT_TAG :{
    
                //xxx this function returns true when element is empty.. can be linked to end element event.
                //returns true if the element is empty
                fEmptyElement = scanStartElement() ;
                //if the element is empty the next event is "end element"
                if(fEmptyElement){
                    setScannerState(SCANNER_STATE_END_ELEMENT_TAG);
                }else{
                    //set the next possible state
                    setScannerState(SCANNER_STATE_CONTENT);
                }
                return XMLEvent.START_ELEMENT ;
            }
        }
    }
    

    Step into the scanStartElement() method.

    //XMLDocumentFragmentScannerImpl.java
    protected boolean scanStartElement()
        throws IOException, XNIException {
        /*...*/
        // call handler
        if (fDocumentHandler != null) {
            fDocumentHandler.emptyElement(fElementQName, fAttributes, null);
        }
        /*...*/
    }

    //AbstractXMLDocumentParser.java
    public void emptyElement(QName element, XMLAttributes attributes, Augmentations augs)
        throws XNIException {

        startElement(element, attributes, augs);
        endElement(element, augs);

    }
    

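    Here fDocumentHandler is a SAXParserImpl$JAXPSAXParser. Through inheritance, the call resolves to the parent class method AbstractXMLDocumentParser.emptyElement().

    However, AbstractXMLDocumentParser does not provide a concrete implementation of endElement(), so the call made inside emptyElement() resolves to the override one level further down the inheritance chain, which is AbstractSAXParser.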

    //AbstractSAXParser.java
    /**
    * The end of an element.
    *
    * @param element The name of the element.
    * @param augs     Additional information that may include infoset augmentations
    *
    * @throws XNIException Thrown by handler to signal an error.
    */
    public void endElement(QName element, Augmentations augs) throws XNIException {
    
        try {
            // SAX2
            if (fContentHandler != null) {
                fAugmentations = augs;
                String uri = element.uri != null ? element.uri : "";
                String localpart = fNamespaces ? element.localpart : "";
                fContentHandler.endElement(uri, localpart, element.rawname);
            }
        }
        catch (SAXException e) {
            throw new XNIException(e);
        }
    }
    

    The comment tells us that endElement() finishes the parsing of one element. Here fContentHandler is the DocumentHandler, so step into its endElement() method.

    //DocumentHandler.java
    public void endElement(String var1, String var2, String var3) {
        try {
            this.handler.endElement();
        } catch (RuntimeException var8) {
            this.handleException(var8);
        } finally {
            this.handler = this.handler.getParent();
        }
    
    }
    

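    DocumentHandler is a key class.

    DocumentHandler extends DefaultHandler. DefaultHandler is the default handler for parsing XML with SAX, which is why WebLogic also uses SAX when it validates XML objects: this keeps the two processes consistent.

    DefaultHandler implements four interfaces: EntityResolver, DTDHandler, ContentHandler, and ErrorHandler.

    DocumentHandler mainly overrides a few of the ContentHandler methods, since its job is to process the content; the rest keep their default behavior.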
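    As a rough illustration of this design, the sketch below subclasses DefaultHandler and overrides only the two ContentHandler callbacks for element boundaries, then parses with the same factory, parser, parse(source, handler) call shape seen in DocumentHandler.parse(). The class name SaxCallbackOrder and the sample document are assumptions for illustration. The recorded order also shows why inner elements are completed before outer ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; it only records the callbacks the parser fires.
class SaxCallbackOrder {

    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        // Override only the ContentHandler callbacks we care about;
        // everything else keeps DefaultHandler's no-op behavior.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        // The end callbacks arrive innermost first: string, then void, then object.
        System.out.println(events("<object><void><string>calc</string></void></object>"));
    }
}
```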

    Here this.handler is a VoidElementHandler, but VoidElementHandler does not implement endElement(), so the lookup moves up to its parent classes.

    package com.sun.beans.decoder;
    
    final class VoidElementHandler extends ObjectElementHandler {
        VoidElementHandler() {
        }
    
        protected boolean isArgument() {
            return false;
        }
    }
    

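    The parent class ObjectElementHandler does not implement endElement() either, so continue up to its parent, NewElementHandler, which does not implement it as well. The implementation is finally found in ElementHandler, the base class of all element handlers.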

    //ElementHandler.java
    public abstract class ElementHandler {
        public void endElement() {
            ValueObject var1 = this.getValueObject();
            if (!var1.isVoid()) {
                if (this.id != null) {
                    this.owner.setVariable(this.id, var1.getValue());
                }
    
                if (this.isArgument()) {
                    if (this.parent != null) {
                        this.parent.addArgument(var1.getValue());
                    } else {
                        this.owner.addObject(var1.getValue());
                    }
                }
            }
        }
    
        protected abstract ValueObject getValueObject();
    }
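
    The interplay between DocumentHandler.endElement() (which moves to the parent handler) and ElementHandler.endElement() (which hands a finished value to the parent) can be modeled with a toy chain. Everything in this sketch is a simplified assumption: ToyHandler and HandlerChainSketch are made-up names, values are plain strings, and the id and isVoid() handling of the real class is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified model of the element handler chain.
class HandlerChainSketch {

    // Toy stand-in for an element handler linked to its parent.
    static final class ToyHandler {
        final String name;
        final ToyHandler parent;
        final boolean argument; // mirrors isArgument(): false for <void>
        final List<Object> arguments = new ArrayList<>();

        ToyHandler(String name, ToyHandler parent, boolean argument) {
            this.name = name;
            this.parent = parent;
            this.argument = argument;
        }

        // The "value" is just the element name plus the collected arguments.
        Object value() {
            return name + arguments;
        }

        // Same decision as in ElementHandler.endElement(): pass the value
        // to the parent, or to the document-level results if there is none.
        void endElement(List<Object> results) {
            Object value = value();
            if (argument) {
                if (parent != null) {
                    parent.arguments.add(value);
                } else {
                    results.add(value);
                }
            }
        }
    }

    private ToyHandler current;
    private final List<Object> results = new ArrayList<>();

    void startElement(String name, boolean argument) {
        current = new ToyHandler(name, current, argument);
    }

    void endElement() {
        try {
            current.endElement(results);
        } finally {
            current = current.parent; // like this.handler = this.handler.getParent()
        }
    }

    // Replays the events of <object><string/><void/></object>.
    static List<Object> demo() {
        HandlerChainSketch doc = new HandlerChainSketch();
        doc.startElement("object", true);
        doc.startElement("string", true);
        doc.endElement();               // value goes to object's arguments
        doc.startElement("void", false);
        doc.endElement();               // not an argument: value is dropped
        doc.endElement();               // no parent: value goes to results
        return doc.results;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```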
    

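    However, the this.getValueObject() call that follows has to be resolved starting from VoidElementHandler again, because ElementHandler only declares getValueObject() as abstract.

    This follows Java's rules for how a subclass object calls methods defined in its parent classes:

    1. When a method is called on a subclass object, the lookup starts in the subclass; if the method is not defined there, the parent class is searched.
    2. If that method in turn calls other methods, the same order applies: the subclass is searched first, then the parent class.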
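    These two rules can be checked with a toy hierarchy shaped like the handler classes. The names Base, Middle, Lower, and Leaf are made up and only loosely play the roles of ElementHandler, NewElementHandler, ObjectElementHandler, and VoidElementHandler; the method bodies are placeholders, not the JDK code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy hierarchy that records which implementation actually runs.
class DispatchSketch {

    static abstract class Base {                 // role: ElementHandler
        final List<String> trace = new ArrayList<>();

        String endElement() {
            trace.add("Base.endElement");
            return getValueObject();             // resolved against the runtime class
        }

        protected abstract String getValueObject();
    }

    static class Middle extends Base {           // role: NewElementHandler
        @Override
        protected String getValueObject() {
            trace.add("Middle.getValueObject()");
            return getValueObject("type");       // overload, overridden further down
        }

        protected String getValueObject(String type) {
            trace.add("Middle.getValueObject(String)");
            return "middle:" + type;
        }
    }

    static class Lower extends Middle {          // role: ObjectElementHandler
        @Override
        protected String getValueObject(String type) {
            trace.add("Lower.getValueObject(String)");
            return "lower:" + type;
        }
    }

    static final class Leaf extends Lower {      // role: VoidElementHandler
        // Defines none of the methods, so every lookup walks up the chain.
    }

    public static void main(String[] args) {
        Leaf leaf = new Leaf();
        System.out.println(leaf.endElement());
        System.out.println(leaf.trace);
    }
}
```

    Each call starts its lookup at Leaf, whichever class the calling code lives in, so the overload invoked from Middle ends up in Lower rather than in Middle itself.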

    Start with ObjectElementHandler, the parent of VoidElementHandler: it has no no-argument getValueObject() either, so keep going up, and the implementation is found in NewElementHandler.

    //NewElementHandler.java
    protected final ValueObject getValueObject() {
        if (this.arguments != null) {
            try {
                this.value = this.getValueObject(this.type, this.arguments.toArray());
            } catch (Exception var5) {
                this.getOwner().handleException(var5);
            } finally {
                this.arguments = null;
            }
        }
    
        return this.value;
    }
    

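    That immediately leads to another call, getValueObject(this.type, this.arguments.toArray()), which is again resolved starting from VoidElementHandler. This time the implementation of getValueObject(Class var1, Object[] var2) is found directly in ObjectElementHandler.

    This chain of calls through the parent classes is admittedly a bit convoluted.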

    //ObjectElementHandler.java
    protected final ValueObject getValueObject(Class<?> var1, Object[] var2) throws Exception {
        if (this.field != null) {
            return ValueObjectImpl.create(FieldElementHandler.getFieldValue(this.getContextBean(), this.field));
        } else if (this.idref != null) {
            return ValueObjectImpl.create(this.getVariable(this.idref));
        } else {
            Object var3 = this.getContextBean();
            String var4;
            var4 = this.method != null && 0 < this.method.length() ? this.method : "new";
    
            Expression var5 = new Expression(var3, var4, var2);
            return ValueObjectImpl.create(var5.getValue());
        }
    }
    

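    The Variables window shows that an Expression is built from var3 = ProcessBuilder, var4 = "start", and var2 = Object[0]. This Expression invokes the method named methodName on the given target object, passing arguments as its parameters. Here these correspond to var3, var4, and var2 respectively.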

    /**
     * Creates a new {@link Expression} object
     * for the specified target object to invoke the method
     * specified by the name and by the array of arguments.
     * <p>
     * The {@code target} and the {@code methodName} values should not be {@code null}.
     * Otherwise an attempt to execute this {@code Expression}
     * will result in a {@code NullPointerException}.
     * If the {@code arguments} value is {@code null},
     * an empty array is used as the value of the {@code arguments} property.
     *
     * @param target  the target object of this expression
     * @param methodName  the name of the method to invoke on the specified target
     * @param arguments  the array of arguments to invoke the specified method
     *
     * @see #getValue
     */
    @ConstructorProperties({"target", "methodName", "arguments"})
    public Expression(Object target, String methodName, Object[] arguments) {
        super(target, methodName, arguments);
    }


    Finally, the start() method is invoked through the Expression, when var5.getValue() is called.
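    The same mechanism can be exercised directly with harmless targets. In this sketch the class ExpressionSketch and its call() helper are assumptions for illustration; only java.beans.Expression and its getValue() method come from the JDK. It shows both forms seen in the walkthrough: a named method on an object, and the special method name "new" on a Class target, which creates an instance.

```java
import java.beans.Expression;
import java.util.ArrayList;

// Hypothetical demo class showing what new Expression(...).getValue() does.
class ExpressionSketch {

    // Reflectively calls target.methodName(args...) and returns the result.
    static Object call(Object target, String methodName, Object... args) {
        try {
            return new Expression(target, methodName, args).getValue();
        } catch (Exception e) {
            throw new IllegalStateException("invocation failed", e);
        }
    }

    public static void main(String[] args) {
        // Counterpart of var3=ProcessBuilder, var4="start", var2=Object[0],
        // but with a String target and a side-effect-free method.
        System.out.println(call("xmldecoder", "toUpperCase"));

        // With a Class as target, the method name "new" invokes a constructor.
        Object list = call(ArrayList.class, "new");
        System.out.println(list.getClass().getName());
    }
}
```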


  • Original article: https://blog.csdn.net/qq_40710190/article/details/125900560