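In the Appium framework, a client script performs UI operations as follows: whenever the script needs to perform a UI operation, it sends an HTTP request (whose payload includes information about the target control) to the Appium server. The server translates the received data and forwards it to the instrumentation program installed on the phone. That program then calls the uiautomator UI-operation library provided by the Android SDK to perform the actual UI operation, and the result is passed back along the same path to the script, closing the loop.
Below we mainly analyze how the script sends HTTP requests to the Appium server.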
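1. Start the Appium service
First start the Appium service so that it listens on port 4723; the script can then send HTTP requests to that port.
2. Create a WebDriver object in the script before running test cases
You can think of this object as the library of wrapper functions that Appium provides to the script for performing UI operations.
The constructor of the WebDriver class sends a request to the Appium server based on the "desired capabilities". After receiving them, the server creates a session id based on this information and returns it to the test script.
Once Appium has the "desired capabilities", it also knows which of the phones attached to the PC it should connect to. The Appium service does a lot of work at this point to make sure the connection to the instrumentation service on the phone succeeds. For the details, see my other article, "Appium: a walkthrough of the flow from startup through testing to shutdown".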
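You may ask why a session id has to be obtained before running test cases.
When a test runs, the test script is essentially sending HTTP requests to the server, but HTTP is stateless: the server treats every HTTP request as unrelated to the ones before it. Without some extra mechanism, every HTTP request would have to carry the "desired capabilities" so that the server knows which phone to talk to. The "desired capabilities" also contain a lot of other information that the Appium service needs in order to run with the specified parameters. In other words, the runtime environment must be the same for every HTTP request.
Attaching the "desired capabilities" to every HTTP request is clearly impractical, so Appium uses a session mechanism. The first time the Appium service receives the "desired capabilities", it stores them locally and returns a session id to the test script. When later test steps send HTTP requests to the Appium service, the service uses the session id to look up the corresponding "desired capabilities" and therefore knows the runtime environment (that is, which phone to talk to, plus settings such as command timeouts or retries specified in the "desired capabilities").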
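The session mechanism described above can be sketched in a few lines. This is only a toy model of the idea, not Appium's server code: the `SessionStore` class, its method names, and the `demo-phone` device name are all made up for illustration.

```python
import uuid


class SessionStore:
    """Toy stand-in for the server-side bookkeeping (hypothetical, not Appium's code)."""

    def __init__(self):
        # session id -> the desired capabilities received when the session was created
        self._sessions = {}

    def create(self, desired_capabilities):
        """Store the capabilities once and hand back an opaque session id."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(desired_capabilities)
        return session_id

    def capabilities_for(self, session_id):
        """Recover the runtime environment for a later request."""
        return self._sessions[session_id]


store = SessionStore()

# First request: the client sends the full desired capabilities.
sid = store.create({"platformName": "Android", "deviceName": "demo-phone"})

# Later requests: only the session id travels with the command.
followup_request = {"sessionId": sid, "using": "id", "value": "xxx"}
caps = store.capabilities_for(followup_request["sessionId"])
print(caps["deviceName"])
```

The point of the design is that the large capabilities dictionary crosses the wire once, and every later request only needs the short id.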
- self.caps = {}
- self.caps["platformName"] = "Android"
- self.caps["platformVersion"] = devices.dev[Constant.phone]["platformVersion"]
- self.caps["deviceName"] = devices.dev[Constant.phone]["phone"]
- self.caps["appPackage"] = Constant.appPackage
- self.caps["appActivity"] = Constant.appActivity
- self.caps['app'] = Constant.app
- self.caps["unicodeKeyboard"] = True
- self.caps["autoAcceptAlerts"] = True # accept permission pop-ups
- self.caps["resetKeyboard"] = True
- self.caps["noReset"] = True
- self.caps["newCommandTimeout"] = 6000
- self.driver = webdriver.Remote('http://127.0.0.1:4723/wd/hub', self.caps) # localhost
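The code above stores the "desired capabilities" in a dictionary in the Appium script and then creates a WebDriver object.
'http://127.0.0.1:4723/wd/hub' means that requests are sent to port 4723 on the local machine, which is the port the running Appium service listens on. In the path part of the URL, /wd/hub, wd is short for webdriver and hub means the central node. The corresponding route can be found in the Node.js source code of the Appium service.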
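To make the pieces of that URL concrete, here is a small sketch using only the standard library. The variable names are my own, and the `/session` suffix is taken from the `Command.NEW_SESSION` entry in the command table shown later in this article.

```python
from urllib.parse import urlsplit

# Split the server URL used in webdriver.Remote(...) into its components.
parts = urlsplit("http://127.0.0.1:4723/wd/hub")

host = parts.hostname   # the machine running the Appium service
port = parts.port       # the port the Appium service listens on
path = parts.path       # the base path that every command path is appended to

# The new-session command is POSTed to <base url> + '/session'.
session_url = "{}://{}{}/session".format(parts.scheme, parts.netloc, path)
print(host, port, path, session_url)
```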
- # Appium's WebDriver, a subclass of Selenium's remote WebDriver
- class WebDriver(webdriver.Remote):
-     def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
-                  desired_capabilities=None, browser_profile=None, proxy=None, keep_alive=False):
-
-         super(WebDriver, self).__init__(command_executor, desired_capabilities, browser_profile, proxy, keep_alive)
-
-         if self.command_executor is not None:
-             self._addCommands()
- # Selenium's remote WebDriver, the parent class
- class WebDriver(object):
-     def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
-                  desired_capabilities=None, browser_profile=None, proxy=None,
-                  keep_alive=False, file_detector=None):
-         ...
-         if type(self.command_executor) is bytes or isinstance(self.command_executor, str):
-             self.command_executor = RemoteConnection(command_executor, keep_alive=keep_alive)
-         ...
-         self.start_session(desired_capabilities, browser_profile)
-         ...
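Let's look at the simple part first: when self.command_executor is not None, self._addCommands() is called to add some commands to the command mapping table self.command_executor._commands. Because Appium is built on top of Selenium, self.command_executor._commands is Selenium's original command mapping table, and Appium adds some new entries to it.
Beyond that, the constructor mainly creates the RemoteConnection object and calls start_session().
1. Create the RemoteConnection object and assign it to the command_executor attribute
Looking at the RemoteConnection constructor, it mainly checks that the URL we passed in at the beginning is well-formed, parses it, and saves the result in its own attributes. It also uses the _commands dictionary to store the HTTP method and URL path of each request (presumably in preparation for later use), in the form: command name string --> (HTTP method, HTTP path). This way, knowing the command name string is enough to find the request method and path.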
- self._commands = {
- Command.STATUS: ('GET', '/status'),
- Command.NEW_SESSION: ('POST', '/session'),
- Command.GET_ALL_SESSIONS: ('GET', '/sessions'),
- Command.QUIT: ('DELETE', '/session/$sessionId'),
- Command.GET_CURRENT_WINDOW_HANDLE:
- ('GET', '/session/$sessionId/window_handle'),
- Command.GET_WINDOW_HANDLES:
- ('GET', '/session/$sessionId/window_handles'),
- Command.GET: ('POST', '/session/$sessionId/url'),
- Command.GO_FORWARD: ('POST', '/session/$sessionId/forward'),
- Command.GO_BACK: ('POST', '/session/$sessionId/back'),
- Command.REFRESH: ('POST', '/session/$sessionId/refresh'),
- Command.EXECUTE_SCRIPT: ('POST', '/session/$sessionId/execute'),
- Command.GET_CURRENT_URL: ('GET', '/session/$sessionId/url'),
- Command.GET_TITLE: ('GET', '/session/$sessionId/title'),
- Command.GET_PAGE_SOURCE: ('GET', '/session/$sessionId/source'),
- Command.SCREENSHOT: ('GET', '/session/$sessionId/screenshot'),
- Command.ELEMENT_SCREENSHOT: ('GET', '/session/$sessionId/element/$id/screenshot'),
- Command.FIND_ELEMENT: ('POST', '/session/$sessionId/element'),
- Command.FIND_ELEMENTS: ('POST', '/session/$sessionId/elements'),
- ...
The Command class defines the names of all the commands that test cases can use:
- class Command(object):
- """
- Defines constants for the standard WebDriver commands.
- While these constants have no meaning in and of themselves, they are
- used to marshal commands through a service that implements WebDriver's
- remote wire protocol:
- https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol
- """
-
- # Keep in sync with org.openqa.selenium.remote.DriverCommand
-
- STATUS = "status"
- NEW_SESSION = "newSession"
- GET_ALL_SESSIONS = "getAllSessions"
- DELETE_SESSION = "deleteSession"
- CLOSE = "close"
- QUIT = "quit"
- GET = "get"
- GO_BACK = "goBack"
- GO_FORWARD = "goForward"
- REFRESH = "refresh"
- ADD_COOKIE = "addCookie"
- GET_COOKIE = "getCookie"
- GET_ALL_COOKIES = "getCookies"
- DELETE_COOKIE = "deleteCookie"
- DELETE_ALL_COOKIES = "deleteAllCookies"
- FIND_ELEMENT = "findElement"
- FIND_ELEMENTS = "findElements"
- FIND_CHILD_ELEMENT = "findChildElement"
- FIND_CHILD_ELEMENTS = "findChildElements"
- CLEAR_ELEMENT = "clearElement"
- CLICK_ELEMENT = "clickElement"
- SEND_KEYS_TO_ELEMENT = "sendKeysToElement"
- SEND_KEYS_TO_ACTIVE_ELEMENT = "sendKeysToActiveElement"
- SUBMIT_ELEMENT = "submitElement"
- UPLOAD_FILE = "uploadFile"
- GET_CURRENT_WINDOW_HANDLE = "getCurrentWindowHandle"
- GET_WINDOW_HANDLES = "getWindowHandles"
- ...
This ties each command keyword to its HTTP method and path.
2. Call start_session(desired_capabilities, browser_profile) to obtain the session id
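As a rough illustration of how such a table can be turned into a concrete request, the sketch below resolves a command name to an HTTP method and a full URL. The `COMMANDS` dictionary copies three entries from the listings above, while the `resolve` helper and the `abc123` session id are hypothetical; filling in `$sessionId` with `string.Template` is how I recall Selenium doing it, so treat that detail as an assumption.

```python
from string import Template

# A few entries in the same shape as RemoteConnection._commands:
# command name string -> (HTTP method, path template)
COMMANDS = {
    "newSession": ("POST", "/session"),
    "findElement": ("POST", "/session/$sessionId/element"),
    "quit": ("DELETE", "/session/$sessionId"),
}


def resolve(base_url, command, params):
    """Look up a command and fill the $placeholders in its path from params."""
    method, path_template = COMMANDS[command]
    path = Template(path_template).substitute(params)
    return method, base_url + path


base = "http://127.0.0.1:4723/wd/hub"
print(resolve(base, "newSession", {}))
print(resolve(base, "findElement", {"sessionId": "abc123"}))
print(resolve(base, "quit", {"sessionId": "abc123"}))
```

Because the session id is part of the path, a command such as `findElement` cannot even be addressed until a session has been created.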
- def start_session(self, desired_capabilities, browser_profile=None):
-     """
-     Creates a new session with the desired capabilities.
-     :Args:
-      - browser_name - The name of the browser to request.
-      - version - Which browser version to request.
-      - platform - Which platform to request the browser on.
-      - javascript_enabled - Whether the new session should support JavaScript.
-      - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object. Only used if Firefox is requested.
-     """
-     capabilities = {'desiredCapabilities': {}, 'requiredCapabilities': {}}
-     for k, v in desired_capabilities.items():
-         if k not in ('desiredCapabilities', 'requiredCapabilities'):
-             capabilities['desiredCapabilities'][k] = v
-         else:
-             capabilities[k].update(v)
-     if browser_profile:
-         capabilities['desiredCapabilities']['firefox_profile'] = browser_profile.encoded
-     response = self.execute(Command.NEW_SESSION, capabilities)
-     if 'sessionId' not in response:
-         response = response['value']
-     self.session_id = response['sessionId']
-     self.capabilities = response['value']
-
-     # Quick check to see if we have a W3C Compliant browser
-     self.w3c = response.get('status') is None
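The key line here is response = self.execute(Command.NEW_SESSION, capabilities), which sends the new-session request.
In Appium, all HTTP requests are sent through the execute() method. execute() in turn calls command_executor.execute(), and it is in that execute() that the request method is finally called to send the HTTP request.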
- def execute(self, driver_command, params=None):
-     """
-     Sends a command to be executed by a command.CommandExecutor.
-
-     :Args:
-      - driver_command: The name of the command to execute as a string.
-      - params: A dictionary of named parameters to send with the command.
-
-     :Returns:
-       The command's JSON response loaded into a dictionary object.
-     """
-     if self.session_id is not None:
-         if not params:
-             params = {'sessionId': self.session_id}
-         elif 'sessionId' not in params:
-             params['sessionId'] = self.session_id
-
-     params = self._wrap_value(params)
-     response = self.command_executor.execute(driver_command, params)
-     if response:
-         self.error_handler.check_response(response)
-         response['value'] = self._unwrap_value(
-             response.get('value', None))
-         return response
-     # If the server doesn't send a response, assume the command was
-     # a success
-     return {'success': 0, 'value': None, 'sessionId': self.session_id}
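As you can see, the function first checks whether session_id is None. While the session id is still being obtained, session_id is None, so execution falls straight through to the logic below. For every request after the session has been created, session_id is not None, so the function checks params and adds the sessionId entry if it is missing. This is how the server knows which client a request comes from.
So this is where the client obtains its session id, and where the session id is attached to every subsequent request!
The call flow is a bit involved; in short: WebDriver.__init__() -> start_session() -> WebDriver.execute() -> command_executor.execute() -> HTTP request to the Appium server.
Now let's use a basic click operation to walk through the process.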
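The session id handling can be reproduced in isolation with a fake executor that records what it is asked to send. Everything here (`FakeExecutor`, `MiniDriver`, the `fake-session-1` id) is a made-up stand-in that only mirrors the shape of the logic in execute() and start_session() above, with no network involved.

```python
class FakeExecutor:
    """Records commands instead of sending HTTP requests (hypothetical helper)."""

    def __init__(self):
        self.sent = []

    def execute(self, command, params):
        self.sent.append((command, dict(params or {})))
        if command == "newSession":
            # Pretend the server created a session for us.
            return {"sessionId": "fake-session-1", "value": {}}
        return {"value": None}


class MiniDriver:
    """Mirrors only the session id handling of WebDriver.execute()."""

    def __init__(self, executor, desired_capabilities):
        self.command_executor = executor
        self.session_id = None  # not known until the first response arrives
        response = self.execute(
            "newSession", {"desiredCapabilities": desired_capabilities})
        self.session_id = response["sessionId"]

    def execute(self, command, params=None):
        # Same branching as the real method: attach the id once we have one.
        if self.session_id is not None:
            if not params:
                params = {"sessionId": self.session_id}
            elif "sessionId" not in params:
                params["sessionId"] = self.session_id
        return self.command_executor.execute(command, params)


executor = FakeExecutor()
driver = MiniDriver(executor, {"platformName": "Android"})
driver.execute("findElement", {"using": "id", "value": "xxx"})

first_command, first_params = executor.sent[0]
second_command, second_params = executor.sent[1]
print("sessionId" in first_params, second_params["sessionId"])
```

The first recorded request carries no sessionId because none exists yet; every later one has it added automatically.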
self.driver.find_element_by_id("xxx").click()
1. driver.find_element_by_id() obtains the control
- def find_element_by_id(self, id_):
-     """Finds an element by id.
-     :Args:
-      - id\_ - The id of the element to be found.
-     :Usage:
-         driver.find_element_by_id('foo')
-     """
-     return self.find_element(by=By.ID, value=id_)
Internally it calls the driver's own find_element() method:
- def find_element(self, by=By.ID, value=None):
-     """
-     'Private' method used by the find_element_by_* methods.
-     :Usage:
-         Use the corresponding find_element_by_* instead of this.
-     :rtype: WebElement
-     """
-     if self.w3c:
-         if by == By.ID:
-             by = By.CSS_SELECTOR
-             value = '[id="%s"]' % value
-         elif by == By.TAG_NAME:
-             by = By.CSS_SELECTOR
-         elif by == By.CLASS_NAME:
-             by = By.CSS_SELECTOR
-             value = ".%s" % value
-         elif by == By.NAME:
-             by = By.CSS_SELECTOR
-             value = '[name="%s"]' % value
-     return self.execute(Command.FIND_ELEMENT, {
-         'using': by,
-         'value': value})['value']
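Here the by and value parameters are adjusted according to the lookup strategy, and then the driver's own execute() method is called. Note the rtype: WebElement line in the docstring, which tells us that find_element() returns a WebElement object.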
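The locator rewriting in the w3c branch is easy to try out on its own. The function below is a standalone sketch, not part of Selenium; the name `to_w3c_locator` is mine, and the string literals are the values of Selenium's `By.ID`, `By.TAG_NAME`, `By.CLASS_NAME`, `By.NAME` and `By.CSS_SELECTOR` constants.

```python
def to_w3c_locator(by, value):
    """Return the (using, value) pair that find_element() would send in W3C mode."""
    if by == "id":
        return "css selector", '[id="%s"]' % value
    if by == "tag name":
        # A tag name is already a valid CSS selector, so only the strategy changes.
        return "css selector", value
    if by == "class name":
        return "css selector", ".%s" % value
    if by == "name":
        return "css selector", '[name="%s"]' % value
    # Other strategies (for example "xpath") are passed through untouched.
    return by, value


print(to_w3c_locator("id", "login"))
print(to_w3c_locator("class name", "btn"))
print(to_w3c_locator("xpath", "//button"))
```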
The return value comes from self.execute(Command.FIND_ELEMENT, {'using': by, 'value': value})['value'], so we need to look inside execute() to see how a WebElement object is returned.
- def execute(self, driver_command, params=None):
-     """
-     Sends a command to be executed by a command.CommandExecutor.
-     :Args:
-      - driver_command: The name of the command to execute as a string.
-      - params: A dictionary of named parameters to send with the command.
-     :Returns:
-       The command's JSON response loaded into a dictionary object.
-     """
-     if self.session_id is not None:
-         if not params:
-             params = {'sessionId': self.session_id}
-         elif 'sessionId' not in params:
-             params['sessionId'] = self.session_id
-
-     params = self._wrap_value(params)
-     response = self.command_executor.execute(driver_command, params)
-     if response:
-         self.error_handler.check_response(response)
-         response['value'] = self._unwrap_value(
-             response.get('value', None))
-         return response
-     # If the server doesn't send a response, assume the command was
-     # a success
-     return {'success': 0, 'value': None, 'sessionId': self.session_id}
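As the docstring says, execute() sends a command to be executed by a command.CommandExecutor and gets back the result, response, which is the JSON response loaded into a dictionary.
The first part of execute() adds the session id to params, as analyzed earlier.
It then wraps params and calls command_executor.execute(driver_command, params) to get response. Next it checks whether response is well-formed; if response contains no error, the value in response is unwrapped. This is where the WebElement object is created.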
- def _unwrap_value(self, value):
-     if isinstance(value, dict) and ('ELEMENT' in value or 'element-6066-11e4-a52e-4f735466cecf' in value):
-         wrapped_id = value.get('ELEMENT', None)
-         if wrapped_id:
-             return self.create_web_element(value['ELEMENT'])
-         else:
-             return self.create_web_element(value['element-6066-11e4-a52e-4f735466cecf'])
-
-     elif isinstance(value, list):
-         return list(self._unwrap_value(item) for item in value)
-     else:
-         return value
create_web_element() is implemented as follows:
- def create_web_element(self, element_id):
-     """Creates a web element with the specified `element_id`."""
-     return self._web_element_cls(self, element_id, w3c=self.w3c)
and _web_element_cls is
_web_element_cls = WebElement
So now it is finally clear how a call to find_element_by_id() ends up, step by step, returning a WebElement.
2. Call click() on the WebElement object
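The unwrapping step can be exercised without a server by feeding it hand-written response values. This is a simplified sketch: `StubElement` and the free function `unwrap` are hypothetical stand-ins for WebElement and `_unwrap_value`, and the sample ids are invented.

```python
# The key a W3C-style response uses to mark an element reference
# (the legacy JSON Wire Protocol uses 'ELEMENT' instead).
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


class StubElement:
    """Minimal stand-in for WebElement: remembers its parent driver and its id."""

    def __init__(self, parent, id_):
        self.parent = parent
        self.id = id_


def unwrap(parent, value):
    """Turn element references inside a response value into element objects."""
    if isinstance(value, dict) and ("ELEMENT" in value or W3C_ELEMENT_KEY in value):
        element_id = value.get("ELEMENT") or value[W3C_ELEMENT_KEY]
        return StubElement(parent, element_id)
    if isinstance(value, list):
        # find_elements returns a list of references; unwrap each one.
        return [unwrap(parent, item) for item in value]
    # Plain values (strings, numbers, None) pass through unchanged.
    return value


legacy = unwrap("driver", {"ELEMENT": "7"})
w3c = unwrap("driver", {W3C_ELEMENT_KEY: "8"})
many = unwrap("driver", [{"ELEMENT": "1"}, {"ELEMENT": "2"}])
plain = unwrap("driver", "some page title")
print(legacy.id, w3c.id, [e.id for e in many], plain)
```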
- def click(self):
-     """Clicks the element."""
-     self._execute(Command.CLICK_ELEMENT)
Step into _execute():
- # Private Methods
- def _execute(self, command, params=None):
-     """Executes a command against the underlying HTML element.
-     Args:
-       command: The name of the command to _execute as a string.
-       params: A dictionary of named parameters to send with the command.
-     Returns:
-       The command's JSON response loaded into a dictionary object.
-     """
-     if not params:
-         params = {}
-     params['id'] = self._id
-     return self._parent.execute(command, params)
What is self._parent? In the WebElement constructor, self._parent is the first argument passed in:
- class WebElement(object):
-
-     def __init__(self, parent, id_, w3c=False):
-         self._parent = parent
-         self._id = id_
-         self._w3c = w3c
Going back to where the WebElement was created above, we find that the WebDriver object is what gets passed in:
- def create_web_element(self, element_id):
-     """Creates a web element with the specified `element_id`."""
-     return self._web_element_cls(self, element_id, w3c=self.w3c)
-
- def _unwrap_value(self, value):
-     if isinstance(value, dict) and ('ELEMENT' in value or 'element-6066-11e4-a52e-4f735466cecf' in value):
-         wrapped_id = value.get('ELEMENT', None)
-         if wrapped_id:
-             return self.create_web_element(value['ELEMENT'])
-         else:
-             return self.create_web_element(value['element-6066-11e4-a52e-4f735466cecf'])
-
-     elif isinstance(value, list):
-         return list(self._unwrap_value(item) for item in value)
-     else:
-         return value
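So self._parent.execute(command, params) still calls the execute() method of the WebDriver object.
At this point it is clear how a control gets clicked: in essence, this too is just an HTTP request sent to the Appium service.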
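Putting the pieces together, the click path can be simulated end to end with two tiny classes. `RecordingDriver` and `MiniElement` are hypothetical stand-ins for WebDriver and WebElement, the ids `s1` and `42` are invented, and the click path template is what I recall from Selenium's command table (it is not shown in the excerpts above), so treat it as an assumption.

```python
from string import Template

# Assumed entry for Command.CLICK_ELEMENT in the command table.
COMMANDS = {"clickElement": ("POST", "/session/$sessionId/element/$id/click")}


class RecordingDriver:
    """Stand-in for WebDriver: resolves commands to requests and records them."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.requests = []

    def execute(self, command, params=None):
        params = dict(params or {})
        # Attach the session id if the caller did not supply one.
        params.setdefault("sessionId", self.session_id)
        method, template = COMMANDS[command]
        # Instead of sending HTTP, remember what would have been sent.
        self.requests.append((method, Template(template).substitute(params)))


class MiniElement:
    """Stand-in for WebElement: delegates every command to its parent driver."""

    def __init__(self, parent, id_):
        self._parent = parent
        self._id = id_

    def click(self):
        # Same idea as WebElement._execute(): add our own id, then delegate.
        return self._parent.execute("clickElement", {"id": self._id})


driver = RecordingDriver("s1")
MiniElement(driver, "42").click()
print(driver.requests)
```

The element itself never talks to the server; it only contributes its id, and the driver contributes the session id and does the sending.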