{ "nbformat": 4, "nbformat_minor": 5, "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" }, "colab": { "name": "browser-automationpy_tutorial.ipynb", "provenance": [], "collapsed_sections": [ "4kN09KPCXgqg", "MrYm4P4msulb", "OG9zfzE5s6ls", "61bb78cf", "Jd9NhHk1ygt-" ] } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "4kN09KPCXgqg" }, "source": [ "## Installation" ], "id": "4kN09KPCXgqg" }, { "cell_type": "markdown", "metadata": { "id": "MrYm4P4msulb" }, "source": [ "#### Required Dependencies" ], "id": "MrYm4P4msulb" }, { "cell_type": "code", "metadata": { "id": "ea03bc45" }, "source": [ "!pip install selenium\n", "!pip install webdriver-manager" ], "id": "ea03bc45", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "OG9zfzE5s6ls" }, "source": [ "#### installing the browser_automationpy" ], "id": "OG9zfzE5s6ls" }, { "cell_type": "code", "metadata": { "id": "nYxu08hgsx3R" }, "source": [ "!python3 -m pip install browser_automationpy" ], "id": "nYxu08hgsx3R", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "61bb78cf" }, "source": [ "## Browser setup" ], "id": "61bb78cf" }, { "cell_type": "code", "metadata": { "id": "BmeExfAogLjd" }, "source": [ "import browser_automation" ], "id": "BmeExfAogLjd", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "s0i2nVTKa6Cr" }, "source": [ "#### Custom Options : Optional" ], "id": "s0i2nVTKa6Cr" }, { "cell_type": "markdown", "metadata": { "id": "ZAUimZui4S7u" }, "source": [ "google colab Jupyter notebook Instruction : \n", "`Ctrl m m` will convert a code cell to a text cell. \n", " `Ctrl m y` will convert a text cell to a code cell. " ], "id": "ZAUimZui4S7u" }, { "cell_type": "markdown", "metadata": { "id": "qWJykC-3bEIr" }, "source": [ "If require browser with custom options, Please use below given code cell to output sample custom options templates." ], "id": "qWJykC-3bEIr" }, { "cell_type": "code", "metadata": { "id": "e658686a" }, "source": [ "# Outputs the json template files\n", "browser_automation.browser_setup.CustomOptions().get_default_custom_options_json()" ], "id": "e658686a", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "wD1Xk-lDmK_9" }, "source": [ "Edit the template based on your need and provide the file path in cell below. if filename and location is not changed no need to change anything." ], "id": "wD1Xk-lDmK_9" }, { "cell_type": "code", "metadata": { "id": "bda91a9a" }, "source": [ "custom_browser_option_json_path = './customoption.json'" ], "id": "bda91a9a", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "awrj7xPJy7VR" }, "source": [ "### Launch Browser" ], "id": "awrj7xPJy7VR" }, { "cell_type": "markdown", "metadata": { "id": "raKWxnxE9A8O" }, "source": [ " if you are using custom options put 'custom_browser_option_json_path' inside Interactions(here in below cell) " ], "id": "raKWxnxE9A8O" }, { "cell_type": "code", "metadata": { "id": "c870a6f9", "outputId": "79c57909-cad7-4ece-b390-1d21a019aeb2" }, "source": [ "temp_browser = browser_automation.browser_manipulation.Interactions()" ], "id": "c870a6f9", "execution_count": null, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "\n", "====== WebDriver manager ======\n", "Current google-chrome version is 84.0.4147\n", "Get LATEST driver version for 84.0.4147\n", "Driver [/home/chaudharyubuntu/.wdm/drivers/chromedriver/linux64/84.0.4147.30/chromedriver] found in cache\n" ] } ] }, { "cell_type": "code", "metadata": { "id": "b31d15d6" }, "source": [ "driver = temp_browser.driver" ], "id": "b31d15d6", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "-mezrgzEvOuA" }, "source": [ "Now you can use selenium driver methods with extra methods provided in this package" ], "id": "-mezrgzEvOuA" }, { "cell_type": "markdown", "metadata": { "id": "Jd9NhHk1ygt-" }, "source": [ "## How to start with browser automation" ], "id": "Jd9NhHk1ygt-" }, { "cell_type": "code", "metadata": { "id": "mUixKIzOvMrb" }, "source": [ "# opening a website\n", "website_link = \"https://selenium.dev\"\n", "driver.get(website_link)" ], "id": "mUixKIzOvMrb", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "0uQPy1OhHq2B" }, "source": [ "To interact with web elements like buttons and input boxes, addresses are required. check [selenium guide](https://www.selenium.dev/documentation/webdriver/locating_elements/) or extensions like- [selectorshub](https://selectorshub.com/ ), [selocity](https://www.ranorex.com/selocity/browser-extension/ ), [Chropath](https://chrome.google.com/webstore/detail/chropath/ljngjbnaijcbncmcnjfhigebomdlkcjo ) for locating web elements and get addresses. It's way easier than it might seem." ], "id": "0uQPy1OhHq2B" }, { "cell_type": "markdown", "metadata": { "id": "EgGFuEdZKqdG" }, "source": [ "element_locator_type : str \n", " these by selenium methods to locate web element. This function contains CLASS_NAME, CSS_SELECTOR, ID,\n", " LINK_TEXT, NAME, PARTIAL_LINK_TEXT, TAG_NAME, XPATH." ], "id": "EgGFuEdZKqdG" }, { "cell_type": "code", "metadata": { "id": "a41Fxx0NvNDW" }, "source": [ "# clicking on buttons\n", "address_of_xpath_type = \"//a[@class='nav-link']//span[contains(text(),'Documentation')]\"\n", "element_locator_type = \"xpath\"\n", "driver.click_on_button(element_locator_type, address_of_xpath_type)" ], "id": "a41Fxx0NvNDW", "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "rAiND43xvNPC" }, "source": [ "# putting text in input boxes\n", "address_of_xpath_type = \"//input[@role='combobox']\"\n", "element_locator_type = \"xpath\"\n", "driver.write_in_inputbox(\"xpath\", mdpi_search_inputbox_xpath, \"text to put in the box\")" ], "id": "rAiND43xvNPC", "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "Jn04iRHx8GiH" }, "source": [ "# scrolling the webpage\n", "#scroll_until : int This is the total amount of scrolling.\n", "#scroll_height : int This is how much scroll is happen in one go\n", "scroll_until = 400\n", "scroll_height = 300\n", "driver.scroll_page(scroll_until, scroll_height)" ], "id": "Jn04iRHx8GiH", "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "4H1QERpD6oM2" }, "source": [ "#code for sleeping for random time to minimise load on server.\n", "browser_automation.os_json_utils.RandomSleep().sleep_for_random_time()" ], "id": "4H1QERpD6oM2", "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "fR1TQRyq8Qfi" }, "source": [ "# renaming and moving files\n", "browser_automation.DownloadUtil().file_rename_and_moving('article_name', \"DOWNLOAD_LOCATION_path\", \"new_path\")" ], "id": "fR1TQRyq8Qfi", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "Yl4pAhvTvNal" }, "source": [ "For leaning more about browser automation check out https://www.selenium.dev/documentation/" ], "id": "Yl4pAhvTvNal" } ] }