MalikAsadMahmood-8962 asked

HtmlAgilityPack Data read from investing.com

Hi,
I need help reading an HTML table (the EPS earnings table, see the attached tabledata.jpg) from the following URL and inserting it into a DataTable:
https://www.investing.com/equities/wrldcal-teleco-earnings
My code connects and returns data from investing.com, but I have zero knowledge of HtmlAgilityPack.

My code is as follows:
 Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
     Using client As New WebClient()
         client.Headers.Add("user-agent", "karen payne")
         ServicePointManager.SecurityProtocol = CType(3072, SecurityProtocolType)
         ServicePointManager.DefaultConnectionLimit = 9999
         Dim page As String = client.DownloadString(siteAddress)
         ' TextBox1.Text = htmlCode

         '  Dim page As String = WebClient.DownloadString("https://www.investing.com/equities/wrldcal-teleco")
         Dim doc As HtmlAgilityPack.HtmlDocument = New HtmlAgilityPack.HtmlDocument()
         doc.LoadHtml(page)

         Dim lstNode As List(Of HtmlNode) = New List(Of HtmlNode)
         Dim lstName As List(Of String) = New List(Of String)
         Dim lstTable As List(Of DataTable) = New List(Of DataTable)



         For Each thed As HtmlNode In doc.DocumentNode.SelectNodes("//thead")
             lstNode.Add(thed.ParentNode)
             Dim text = thed.SelectSingleNode("tr").SelectSingleNode("th").InnerText
             '  Console.WriteLine(text)
             MsgBox(text)

             MsgBox(text)
             lstName.Add(text)
         Next


         For i As Integer = 0 To lstNode.Count - 1
             Dim dt As DataTable = New DataTable
             dt.TableName = lstName(i)
             For Each trNode As HtmlNode In lstNode(i).SelectNodes("tr")
                 If trNode.Attributes("class") Is Nothing Then
                     For Each colNode As HtmlNode In trNode.SelectNodes("td")
                         dt.Columns.Add(colNode.InnerText)
                     Next
                 Else
                     Dim j As Integer = 0
                     Dim row As DataRow = dt.NewRow()
                     For Each rowNode As HtmlNode In trNode.SelectNodes("td")
                         row(j) = rowNode.InnerText
                         j += 1
                     Next
                     dt.Rows.Add(row)
                     '  MsgBox("test")
                 End If
             Next
             lstTable.Add(dt)
             DataGridView1.DataSource = dt

         Next
     End Using

     MsgBox("final finished...")



 End Sub


XingyuZhao-MSFT answered

Hi @MalikAsadMahmood-8962,
You need to select the specific table with Html Agility Pack rather than every table on the page.
Take a look at the following code:

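             ...
             Dim lstName As List(Of String) = New List(Of String)
             Dim dt As DataTable = New DataTable

             For Each table As HtmlNode In doc.DocumentNode.SelectNodes("//table[contains(@id,'earningsHistory')]")
                 ' Build the column names. A header that contains &nbsp;
                 ' is appended to the previous header instead of starting a new column.
                 For Each columnNameNode As HtmlNode In table.SelectNodes(".//tr/th")
                     Dim columnName As String = columnNameNode.InnerText
                     If columnName.Contains("&nbsp") Then
                         Dim name = columnName.Replace("&nbsp;", " ")
                         lstName(lstName.Count - 1) += name
                     Else
                         lstName.Add(columnName)
                     End If
                 Next
                 For Each colName As String In lstName
                     dt.Columns.Add(colName)
                 Next

                 Dim i As Integer = 0
                 Dim row As DataRow = dt.NewRow()
                 For Each itemNode As HtmlNode In table.SelectNodes(".//tr/td")
                     Dim cellText As String = itemNode.InnerText.Replace("&nbsp;", " ")
                     If itemNode.InnerText.Contains("&nbsp") AndAlso i > 0 Then
                         ' Continuation cell: append it to the previous cell,
                         ' mirroring how the headers are merged above.
                         row(i - 1) = CStr(row(i - 1)) & cellText
                     Else
                         If i = dt.Columns.Count Then
                             ' The current row is full: store it and start a new one.
                             dt.Rows.Add(row)
                             row = dt.NewRow()
                             i = 0
                         End If
                         row(i) = cellText
                         i += 1
                     End If
                 Next
                 ' Store the last row, which the loop itself never adds.
                 If i > 0 Then dt.Rows.Add(row)
             Next

             DataGridView1.DataSource = dt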

Result of my test:
80888-1.png
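
As a side note on the `&nbsp;` handling in the code above: Html Agility Pack leaves entities undecoded in `InnerText`, so one alternative to replacing the literal text is to decode the cell first. The following is only a sketch, assuming the HtmlAgilityPack NuGet package is referenced; the module and function names are hypothetical and not part of the original answer.

```vbnet
Imports HtmlAgilityPack

Module CellTextCleanup
    ' Hypothetical helper: decodes HTML entities in a node's text and
    ' replaces non-breaking spaces (U+00A0) with ordinary spaces.
    Function CleanCellText(node As HtmlNode) As String
        ' HtmlEntity.DeEntitize turns entities such as &nbsp; or &amp;
        ' into the characters they stand for.
        Dim decoded As String = HtmlEntity.DeEntitize(node.InnerText)
        Return decoded.Replace(ChrW(160), " "c).Trim()
    End Function
End Module
```

With a helper like this, each cell could be stored as `CleanCellText(itemNode)`. A check for continuation cells would then have to look at the decoded text, since the literal `&nbsp` no longer appears in it.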


Best Regards,
Xingyu Zhao



MalikAsadMahmood-8962 answered

Thank you for the reply. From the following URL, https://www.investing.com/equities/wrldcal-teleco-earnings, I want to read the EPS table. A snapshot of the HTML table is attached (tabledata.jpg).




MalikAsadMahmood-8962 answered

Thank you for your continued support, it is working fine. However, when I change the URL to get the data for another product from the same kind of page, the HTML table name changes while the class stays the same. For example, reading the same table for a different product does not work with the following URLs:

https://www.investing.com/equities/taha-spinning-earnings

https://www.investing.com/equities/tri-star-poly-earnings

Thank you.

XingyuZhao-MSFT commented

Hi @MalikAsadMahmood-8962,
Since the table id always starts with 'earningsHistory', you can use an XPath expression such as:

     //table[contains(@id,'earningsHistory')]

I have updated my code.
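
To illustrate the point about the id prefix, here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. It uses the XPath 1.0 `starts-with` function, which is stricter than the `contains` shown above, and checks for a missing table, because `SelectSingleNode` returns Nothing when nothing matches. The module name, the helper name and the sample markup are made up for the example.

```vbnet
Imports System
Imports HtmlAgilityPack

Module EarningsTableLookup
    ' Hypothetical helper: returns the first table whose id begins with
    ' "earningsHistory", or Nothing when the page has no such table.
    Function FindEarningsTable(doc As HtmlDocument) As HtmlNode
        Return doc.DocumentNode.SelectSingleNode("//table[starts-with(@id,'earningsHistory')]")
    End Function

    Sub Main()
        Dim doc As New HtmlDocument()
        ' Made-up markup standing in for the downloaded page.
        doc.LoadHtml("<table id='earningsHistory123'><tr><th>EPS</th></tr></table>")

        Dim table As HtmlNode = FindEarningsTable(doc)
        If table Is Nothing Then
            ' Guard against pages where the table is absent or named differently.
            Console.WriteLine("No earnings table found.")
        Else
            Console.WriteLine(table.GetAttributeValue("id", ""))
        End If
    End Sub
End Module
```

The same Nothing check applies to `SelectNodes`, which also returns Nothing rather than an empty collection when the expression matches no nodes, so a `For Each` over its result can fail on a page with a different layout.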
