VBA to get the href value

Multi tool use
Multi tool use


VBA to get the href value



I am writing macro to extract the href value from a website, example here is to extract the value: '/listedco/listconews/SEHK/2015/0429/LTN201504291355_C.pdf' from the html code below. The href is one of the attributes of the html tag 'a', I have add the code getElementbyTagName'a' but it did not work, my question is how to extract that href value to column L. Anyone could help? Thanks in advance!


<a id="ctl00_gvMain_ctl03_hlTitle" class="news" href="/listedco/listconews/SEHK/2015/0429/LTN201504291355_C.pdf" target="_blank">二零一四年年報</a>




Sub Download_From_HKEX()
Dim internetdata As Object
Dim div_result As Object
Dim header_links As Object
Dim link As Object
Dim URL As String
Dim IE As Object
Dim i As Object
Dim ieDoc As Object
Dim selectItems As Variant
Dim h As Variant

Dim LocalFileName As String
Dim B As Boolean
Dim ErrorText As String
Dim x As Variant

'Key Ratios
For x = 1 To 1579
Set IE = New InternetExplorerMedium
IE.Visible = True
URL = "http://www.hkexnews.hk/listedco/listconews/advancedsearch/search_active_main_c.aspx"
IE.navigate URL
Do
DoEvents
Loop Until IE.readyState = 4
Application.Wait (Now + TimeValue("0:00:05"))
Call IE.Document.getElementById("ctl00_txt_stock_code").setAttribute("value", Worksheets("Stocks").Cells(x, 1).Value)

Set selectItems = IE.Document.getElementsByName("ctl00$sel_tier_1")
For Each i In selectItems
i.Value = "4"
i.FireEvent ("onchange")
Next i

Set selectItems = IE.Document.getElementsByName("ctl00$sel_tier_2")
For Each i In selectItems
i.Value = "159"
i.FireEvent ("onchange")
Next i

Set selectItems = IE.Document.getElementsByName("ctl00$sel_DateOfReleaseFrom_d")
For Each i In selectItems
i.Value = "01"
i.FireEvent ("onchange")
Next i

Set selectItems = IE.Document.getElementsByName("ctl00$sel_DateOfReleaseFrom_m")
For Each i In selectItems
i.Value = "04"
i.FireEvent ("onchange")
Next i

Set selectItems = IE.Document.getElementsByName("ctl00$sel_DateOfReleaseFrom_y")
For Each i In selectItems
i.Value = "1999"
i.FireEvent ("onchange")
Next i

Application.Wait (Now + TimeValue("0:00:02"))
Set ieDoc = IE.Document
With ieDoc.forms(0)
Call IE.Document.parentWindow.execScript("document.forms[0].submit()", "JavaScript")
.submit
End With
Application.Wait (Now + TimeValue("0:00:03"))

'Start here to extract the href value.
Set internetdata = IE.Document
Set div_result = internetdata.getElementById("ctl00_gvMain_ctl03_hlTitle")
Set header_links = div_result.getElementsByTagName("a")
For Each h In header_links
Set link = h.ChildNodes.Item(0)
Worksheets("Stocks").Cells(Range("L" & Rows.Count).End(xlUp).Row + 1, 12) = link.href
Next
Next x
End Sub





Is there anyone could help?
– Nicholas Kan
Sep 20 '15 at 12:47





What is the problem you have encountered? It's not clear from your question. Edit your question to elaborate.
– omegastripes
Sep 20 '15 at 20:18





div_result.getElementsByClassName("a") >> div_result.getElementsByTagName("a")
– Tim Williams
Sep 21 '15 at 0:26


div_result.getElementsByClassName("a")


div_result.getElementsByTagName("a")





@TimWilliams Hi Tim, thanks for your answer but sorry it is just a typo. I did tried getElementsByTagName, as well as getElementsByClassName("news"), they did not work, a solution may be get the attribute "href" after getting the TagName"a", since the "href" is one of the attributes of tag "a". But I don't know the code to get attributes, could you help?
– Nicholas Kan
Sep 21 '15 at 1:02





Your anchor element with target href has id="ctl00_gvMain_ctl03_hlTitle, so you can retrirve url IE.document.getElementById("ctl00_gvMain_ctl03_hlTitle").href or simply IE.document.ctl00_gvMain_ctl03_hlTitle.href. Also try to retrieve the data you need via XHR instead of IE.
– omegastripes
Sep 21 '15 at 2:55



id="ctl00_gvMain_ctl03_hlTitle


IE.document.getElementById("ctl00_gvMain_ctl03_hlTitle").href


IE.document.ctl00_gvMain_ctl03_hlTitle.href




3 Answers
3


For Each h In header_links
Worksheets("Stocks").Cells(Range("L" & Rows.Count).End(xlUp).Row + 1, 12) = h.href
Next



EDIT: The id attribute is supposed to be unique in the document: there should only be a single element with any given id. So


id


IE.Document.getElementById("ctl00_gvMain_ctl03_hlTitle").href



should work.





@Tim Williams, Hi Tim, your code works for extracting all the href links on the website but what I want to extract is only those with id="ctl00_gvMain_ctl03_hlTitle", the href attribute is one of those attributes in the tag <a>, I think something like getAttribute would work but I don't know the coding for that, could you help again? Thanks for your patience!
– Nicholas Kan
Sep 21 '15 at 5:40



id="ctl00_gvMain_ctl03_hlTitle"





@Tim Hi Tim, I am currently using your firs answer which is h.href method, it works perfectly. But I did not test your updated answer. However, thanks a lot!
– Nicholas Kan
Sep 22 '15 at 15:57


h.href


WB.Document.GetElementById("ctl00_gvMain_ctl04_hlTitle").GetAttribute("href").ToString



Use a CSS selector to get the element then access its href attribute.


href


#ctl00_gvMain_ctl03_hlTitle



The above is element with id ctl00_gvMain_ctl03_hlTitle. "#" means id.


id ctl00_gvMain_ctl03_hlTitle


"#"


Debug.Print IE.document.querySelector("#ctl00_gvMain_ctl03_hlTitle").getAttribute("href").innerText






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

lgarvUb8za1bkKqWVEa,nAtNMtIu,8InWtjcD z6T7y,xjc 7e8aB
d,qa,7S ANM60PoCHfQOxpHfd,jyvw6E,MEk2b9D,v,8UY,QJqXxT,i5UuY2 brTEQlEvLQTqXegZCkA ckxRuJUj8YcPLMif,v7e Z61Wk

Popular posts from this blog

Delphi Android file open failure with API 26

.

Amasya