• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Finding an image and its title

Mateusz

New Member
Hey, this macro should:

load URL which written + word from A2 + word from Z2 what gives us a resualt of searching in google images sth like "32925001 ceneo"
then it should find a title of this website
and put it into the right cell.

But in MsgBox I recive "Error 403 Forbidden!!!"1

What shall I do?


Code:
 Sub Znajdz_grafike()
Dim s As String
Dim title As String
    Dim objHttp As Object
  Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
  If Right(s, 3) <> "htm" And Right(s, 4) <> "html" Then
If Right(s, 1) <> "/" Then s = s & "/"
End If
    objHttp.Open "GET", "http://www.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=" & Worksheets("Arkusz6").Range("A2").Text & Worksheets("Arkusz6").Range("Z2").Text, False
    objHttp.Send ""
    title = objHttp.ResponseText

    If InStr(1, UCase(title), "<TITLE>") Then
      title = Mid(title, InStr(1, UCase(title), "<TITLE>") + Len("<TITLE>"))
      title = Mid(title, 1, InStr(1, UCase(title), "</TITLE>") - 1)
    Else
      title = ""
    End If

    MsgBox title
    Range("X2").Value = title
 
 
End Sub
 
Hi, Mateusz!
Very funny... but without the A2 and Z2 cell values how are we supposed to even analyze the error?
Regards!
 
Hi, Mateusz!

A few considerations about the posted code:

a) This does nothing.
Code:
  If Right(s, 3) <> "htm" And Right(s, 4) <> "html" Then
If Right(s, 1) <> "/" Then s = s & "/"
End If
As the first executable statements after object creation, the variable s is empty.

b) This works fine.
Code:
    objHttp.Open "GET", "http://www.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=" & Worksheets("Hoja1").Range("A2").Text & Worksheets("Hoja1").Range("Z2").Text, False
    objHttp.Send ""
    title = objHttp.ResponseText
And fills variable title with all the HTML code.

c) This fails.
Code:
    If InStr(1, UCase(title), "<TITLE>") Then
      title = Mid(title, InStr(1, UCase(title), "<TITLE>") + Len("<TITLE>"))
      title = Mid(title, 1, InStr(1, UCase(title), "</TITLE>") - 1)
    Else
      title = ""
    End If
Try debugging step by step with F8, and you'll find that variable title is correctly updated as per the logic you're using. It's a very long string to debug manually so I recommend you to test by parts.

Hope it helps.

Regards!
 
Well, in Range("X2").Value = title
it fails...
But it doesn't change a fact that I still dont know what to do with it
 
Hi, Mateusz!

The problem is that you're parsing wrongly the originally value stored in variable title. Before the If...EndIf structure the variable contains the whole (and up to then still correct) value, so check the arguments used in each of the 2 Mid instructions. As I wrote before:
Try debugging step by step with F8, and you'll find that variable title is correctly updated as per the logic you're using. It's a very long string to debug manually so I recommend you to test by parts.

About don't working with X2 cell value, in my case it works perfectly as I place the procedure in the class module of the related worksheet. If in yours it doesn't work I assume that you placed it in a standard module, if so then use a worksheet qualifier as you did with A2 and Z2 cells.

Regarding this:
But it doesn't change a fact that I still dont know what to do with it
just follow up the previous and above guidelines.

Regards!
 
I checked

Code:
    objHttp.Open "GET", "http://www.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=" & Worksheets("Arkusz6").Range("A2").Text & Worksheets("Arkusz6").Range("F2").Text, False

IS NOT WORKING


the address with the same words which where in A2 and F2 but I searched them manually in google and copied the URL:
Code:
objHttp.Open "GET", "http://www.google.pl/search?q=32925001+ceneo&espv=210&es_sm=122&source=lnms&tbm=isch&sa=X&ei=qBIaU8L9AaaR7Aaz6YCwCQ&ved=0CAkQ_AUoAQ&biw=1920&bih=931", False

IS NOT WORKING

If I change to

Code:
objHttp.Open "GET", "http://www.google.com", False

The title= Google

SO IT IS WORKING

the question: What is wrong with the resualts os searching page?
Why does it have "Forbiden 403" ... ?
 
Hi, Mateusz!

You checked wrong.

I checked

Code:
    objHttp.Open "GET", "http://www.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=" & Worksheets("Arkusz6").Range("A2").Text & Worksheets("Arkusz6").Range("F2").Text, False

IS NOT WORKING
It's working, check cell X4.

the address with the same words which where in A2 and F2 but I searched them manually in google and copied the URL:
Code:
objHttp.Open "GET", "http://www.google.pl/search?q=32925001+ceneo&espv=210&es_sm=122&source=lnms&tbm=isch&sa=X&ei=qBIaU8L9AaaR7Aaz6YCwCQ&ved=0CAkQ_AUoAQ&biw=1920&bih=931", False

IS NOT WORKING
It's working, check cell X4.

I added a few lines for making your debugging easier and this is the tested code:
Code:
Option Explicit

Sub Znajdz_grafike()
Dim s As String
Dim title As String
    Dim objHttp As Object
  Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
  If Right(s, 3) <> "htm" And Right(s, 4) <> "html" Then
If Right(s, 1) <> "/" Then s = s & "/"
End If
    objHttp.Open "GET", "http://www.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=" & Worksheets("Arkusz6").Range("A2").Text & Worksheets("Arkusz6").Range("Z2").Text, False
    objHttp.Send ""
    title = objHttp.ResponseText
[X4].Value = title
    If InStr(1, UCase(title), "<TITLE>") Then
[X5].Value = InStr(1, UCase(title), "<TITLE>") + Len("<TITLE>")
      title = Mid(title, InStr(1, UCase(title), "<TITLE>") + Len("<TITLE>"))
[X6].Value = title
[X7].Value = InStr(1, UCase(title), "</TITLE>") - 1
      title = Mid(title, 1, InStr(1, UCase(title), "</TITLE>") - 1)
[X8].Value = title
    Else
      title = ""
    End If

    MsgBox title
    Range("X2").Value = title
End Sub

As you can see the problem is the same that I posted earlier:
The problem is that you're parsing wrongly the originally value stored in variable title. Before the If...EndIf structure the variable contains the whole (and up to then still correct) value, so check the arguments used in each of the 2 Mid instructions.
You're wrongly parsing the retrieve text into the variable title. And I won't analyze why, I leave that for you: you have the code there, the debug data that you say you checked but it seems that you didn't do it carefully, so count manually in which position of the original text is the data you want to extract, count manually its length, and then compare those figures with the numbers that you use in your Mid functions.

Sorry for not being able of further help.

Regards!
 

Attachments

Ok I handled it in a different way- generally it should download the URL of image so I chenged it- instead of searching google it will search nokaut.pl - there are all products actually...

So here we have:


Code:
 Sub Znajdz_grafike()
Dim s As String
Dim title As String
    Dim objHttp As Object
  Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
  If Right(s, 3) <> "htm" And Right(s, 4) <> "html" Then
If Right(s, 1) <> "/" Then s = s & "/"
End If
    objHttp.Open "GET", "http://www.nokaut.pl/ravak+X070002", False
    objHttp.send ""
    title = objHttp.responseText

    If InStr(1, UCase(title), "<TITLE>") Then
      title = Mid(title, InStr(1, UCase(title), "<TITLE>") + Len("<TITLE>"))
      title = Mid(title, 1, InStr(1, UCase(title), "</TITLE>") - 1)
    Else
      title = ""
    End If


    Range("A5").Value = title

End Sub


BUT Instead of "<TITLE>" I would like it to find a "Image " content=""

But unfortunetly I have no idea how to put Image " content=" into " "

VBA freaks out then :/

This is how it should works (but beacuse of " " in HTML document it doesnt):

Code:
    If InStr(1, UCase(title), ""Image " content="") Then
      title = Mid(title, InStr(1, UCase(title), ""Image " content="") + Len(""Image " content=""))
      title = Mid(title, 1, InStr(1, UCase(title), " /><meta") - 1)

So how to let VBA understand that "Chars" which he needs to look for are " "" "

?
 
Hi, Mateusz!

The problem is quite another than you describe.

URLs, 1) build in the code, 2) IE 9 (years I didn't use it), 3) Firefox 27, 4) Chrome 33, 5) build in the code (new), 6) redirected by the site -Chrome version- (new)

1) http://www.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=32925001 Ceneo
2) https://www.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=32925001 Ceneo
3) https://encrypted.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=32925001 Ceneo
4) https://encrypted.google.com/search?site=imghp&tbm=isch&source=hp&biw=1920&bih=858&q=32925001 Ceneo
5) http://www.nokaut.pl/ravak X070002
6) http://www.nokaut.pl/baterie-umywalkowe/bateria-ravak-suzan-x070002.html

As you can see, in the original case (1-4) despite the difference between browsers the target page is the same: a Google image search. But in the new case (5-6) you're accessing a completely different web page, which besides redirects you.

I don't know why did you think that you could obtain that from Google search with the originally posted code (it's you who should, since you provided the code), but in both cases the code works fine: what you're getting is what you're asking for, i.e., what's between the title tags: <title> and </title>.

In the 1-4 cases: <title>Error 403 (Forbidden)!!1</title>
In the 5-6 cases: <title>Bateria Ravak SUZAN X070002</title>

Buy however you should first check manually (and correctly as I explained above and not as you insisted that you did) what is the content retrieved from the URL and stored in the variable title. Then adjust the URL properly to fit your needs, but what you can't do is try to find a battery where is forbidden.

Regards!
 
Thanks for everything, now it works, with title for example it worked; but the following doesn't work; No Idea why:


Code:
Sub Znajdz_grafike()
Dim s As String
Dim title As String
    Dim objHttp As Object
  Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
  If Right(s, 3) <> "htm" And Right(s, 4) <> "html" Then
If Right(s, 1) <> "/" Then s = s & "/"
End If
    objHttp.Open "GET", "http://www.niebieskalazienka.pl/towar/szukaj/" & Worksheets("Arkusz6").Range("A2").Text, False
    objHttp.send ""
    title = objHttp.responseText



      If InStr(1, UCase(title), "CENTER""><IMG SRC=""") Then
      title = Mid(title, InStr(1, UCase(title), "CENTER""><IMG SRC=""") + Len("CENTER""><IMG SRC="""))
      title = Mid(title, 1, InStr(1, UCase(title), " alt=") - 1)
    Else
      title = ""
    End If


    Range("X2").Value = title

End Sub

Here is the link to the page I am testing it on

http://www.niebieskalazienka.pl/towar/szukaj/A801470004
 
Hi, Mateusz!
I don't have a damn clue to explain why it doesn't work. I just know that it should work with this link:
http://www.niebieskalazienka.pl/tow...stojaca-z-korkiem-cubic-deante-bdd-021m-chrom
I didn't test it, but the pages you're crawling into seem to work when the URL is of an individual itemand not when of a category or group of them. So you yet know the method, just try it by yourself analyzing each responseText retrieved text.
Regards!
PS: As a tip (for free and last in this topic) there's no standard method that you could apply to "steal" (or get or use the verb you want) data from every web page. Not even general data as title, not even in the different pages of a website, not even... never. Each page corresponding to certain design might or might not expose its data in the same way. So once you proved your method with one and it works, if it doesn't work for another one, then dump the original first content into the title variable and play there. Over.
 
Back
Top