Using BeautifulSoup to Search Yahoo Finance Statistics Page


I am trying to scrape data out of the Yahoo Finance Statistics Page.
In this instance, it is the "5 Year Average Dividend Yield".
The data that I need is in this type of format.


<tr>
<td>
<span>5 Year Average Dividend Yield</span>
</td>
<td class="Fz(s) Fw(500) Ta(end)">6.16</td>
</tr>



I'm new to BeautifulSoup and I've been trying to read the bs4 documentation, but have had no luck so far.
I just realised that I was parsing through a table. (Yes, I'm a noob.)



Here's my code so far. It successfully prints out all the rows in the table.
I need help with isolating the row that contains "5 Year Average Dividend Yield".
I just need the numerical value in the next column.
Thanks in advance.



New edit: I've placed version 0.8 below, which gets the "5 Year Average Dividend Yield" value that I was looking for.


# Version 0.8 - This worked. It got the value for "5 Year Average Dividend Yield"
# Aim: Find the value for "5 Year Average Dividend Yield".

import csv
import time
from bs4 import BeautifulSoup
from selenium import webdriver

file_path = "C:/temp/temp29/"
file_name = "ASX_20180621_lite.txt"
file_path_name = file_path + file_name
print(file_path_name)

# Phase 1 - place all ticker symbols into a list
tickers_phase1_arr = []

with open(file_path_name, "rt") as incsv:
    readcsv = csv.reader(incsv, delimiter=',')
    for row in readcsv:
        ticker_phase1 = row[0]  # the ticker symbol is in the first column
        ticker_dot_ax = ticker_phase1 + ".AX"
        tickers_phase1_arr.append(ticker_dot_ax)
print(tickers_phase1_arr)


# Phase 2 - look up the statistic for each ticker
key_stats_on_stat = ['5 Year Average Dividend Yield']

# Initialise the browser
browser = webdriver.PhantomJS()

data = {}

for ticker_phase2 in tickers_phase1_arr:
    print(ticker_phase2)
    #time.sleep(5)
    # Set the statistics page url for this ticker
    url = "https://finance.yahoo.com/quote/{0}/key-statistics?p={0}".format(ticker_phase2)
    browser.get(url)
    # Run a script that gets all the html the browser rendered for the page
    innerHTML = browser.execute_script("return document.body.innerHTML")
    # Turn innerHTML into a BeautifulSoup object to make the components easier to access
    soup = BeautifulSoup(innerHTML, 'html.parser')
    # Find the row containing the stat label, then read the value from the next cell
    for stat in key_stats_on_stat:
        page_stat = soup.find(text=stat)
        try:
            page_row = page_stat.find_parent('tr')
            try:
                # Some rows hold the value in a second <span>
                page_statnum = page_row.find_all('span')[1].contents[0]
            except IndexError:
                # Otherwise fall back to the second <td>
                page_statnum = page_row.find_all('td')[1].contents[0]
        except AttributeError:
            print('Invalid parent for this element')
            page_statnum = "N/A"
        print(page_statnum)
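
For reference, the span/td fallback above can be tested without Selenium against the fragment shown earlier. A minimal sketch (wrapping the fragment in a <table> tag is my assumption, since the question only shows a single <tr>):

from bs4 import BeautifulSoup

# The fragment from the question, wrapped in a <table> (assumption) so the
# parser builds a complete table structure.
html = """
<table><tr>
<td><span>5 Year Average Dividend Yield</span></td>
<td class="Fz(s) Fw(500) Ta(end)">6.16</td>
</tr></table>
"""

soup = BeautifulSoup(html, 'html.parser')
page_stat = soup.find(text='5 Year Average Dividend Yield')
page_row = page_stat.find_parent('tr')
try:
    # Some rows put the value in a second <span> ...
    page_statnum = page_row.find_all('span')[1].contents[0]
except IndexError:
    # ... this row keeps it in the second <td> instead.
    page_statnum = page_row.find_all('td')[1].contents[0]
print(page_statnum)  # 6.16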




1 Answer



There are a few ways you can reach the td element containing the desired value, starting from the label in the preceding td element. One of them would be to first get the span element in the first column and then use find_next() to locate the next td element:



tr.find(text='5 Year Average Dividend Yield').find_next('td').get_text()



where tr represents the current row.
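
A runnable sketch of this approach against the fragment from the question (the <table> wrapper is my assumption):

from bs4 import BeautifulSoup

html = """
<table><tr>
<td><span>5 Year Average Dividend Yield</span></td>
<td class="Fz(s) Fw(500) Ta(end)">6.16</td>
</tr></table>
"""

soup = BeautifulSoup(html, 'html.parser')
tr = soup.find('tr')  # the current row

# The label text sits inside the first td; find_next('td') then jumps to the
# td that follows it in document order, i.e. the value cell.
value = tr.find(text='5 Year Average Dividend Yield').find_next('td').get_text()
print(value)  # 6.16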





Another approach might scale a bit better. If you need to do this kind of lookup often, you can construct a dictionary with the text of each first-column element as the key and the corresponding second-column element as the value:


data = {}
for tr in soup.find('table').find_all('tr'):
    first_cell, second_cell = tr.find_all('td')[:2]
    data[first_cell.get_text(strip=True)] = second_cell.get_text(strip=True)



Then, you can query data by the text of the first column:




print(data['5 Year Average Dividend Yield'])
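
A self-contained sketch of this approach against the question's fragment (again wrapped in a <table>, which is my assumption; note that on the live page soup.find('table') returns only the first table, so you may need a more specific selector if the stat lives in a later table):

from bs4 import BeautifulSoup

html = """
<table><tr>
<td><span>5 Year Average Dividend Yield</span></td>
<td class="Fz(s) Fw(500) Ta(end)">6.16</td>
</tr></table>
"""

soup = BeautifulSoup(html, 'html.parser')

data = {}
for tr in soup.find('table').find_all('tr'):
    cells = tr.find_all('td')
    if len(cells) < 2:
        continue  # skip header rows or rows without a value cell
    first_cell, second_cell = cells[:2]
    data[first_cell.get_text(strip=True)] = second_cell.get_text(strip=True)

print(data['5 Year Average Dividend Yield'])  # 6.16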





Hi Alec. Thanks for your help with this one. I tried running the second option, however it wasn't finding any data. However, your comment about getting the span, then the td, got me thinking. I've posted an updated version of the code and it picks up the value that I'm looking for. Cheers, Joe
– JiggidyJoe
Jul 1 at 4:11





