• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

How to check URL validity

Good morning.

I have a problem to solve. I have many urls I would like to test for.

I would need a macro to check every url of the column "a", checking if the internet page still exists (in this case the macro will insert "ok" in the corresponding cell in column "b" or "no" if the page does not exist.

If there is a page redirection, in this case I would like to write "redrict" in column "b" and insert the new url into the corresponding cell in column "d".

Regarding redirection, I would like to take into account only the differences that exist before the last "/" of the column "a" url:

Example line 4:

The url is different in the beginning "http: // www." Compared to the new url that starts only with "http: //", then it is a redirection.

You will see the diversity of lines 9-10-11

It's important to keep in mind that it's not a redirection, when what's changing is just the part that is after the last "/" of the column "a" url:

example:

Http://www.poliba.it/ e http://www.poliba.it/xxx

Lastly, in the redirect column I would like the link to be reported indicating the first degree domain, not even the part that is after "/" after the extension.

Thank you
 

Attachments

  • url.xlsx
    8.4 KB · Views: 36
Hi !

As there are many samples on Web …

So the faster way is to use a request with appropriate URL syntax
but it is not the case in row #14 of your attachment !

So if you know exactly what is a valid URL just mod
a request code to take care of this case …
If not, as a beginner and with this kind of URL, pilot Internet Explorer
like you can see in tutorials over the Web …

Also check exact result with the one of this row #14 …

As a reminder, the purpose of a private conversation
is not to answer to any forum thread …
 
As I do not know how to produce wrong result
so first check row #14 result as it can't be a redirection
if you ever know what is an URL, a website address, a webbrowser, …

And as they are some yet ready VBA code on web for this purpose !
 
Hi Mark
Since you've insisted I've been searching on the web for some similar macro.
I found these three links, which operate differently.
By combining them together I could get what I want, however, this is something I could not do, if not in 5 years if I get to study how to create a macro :)


https://www.searchenginepeople.com/blog/how-to-check-http-redirects-with-microsoft-excel.html

https://social.msdn.microsoft.com/F...o-check-and-see-if-they-exists?forum=exceldev

https://www.mrexcel.com/forum/excel-questions/632198-check-if-url-valid.html

Would you help me combine them together?
 
Only 3 ?‼

And for row #14 ?! …
'cause a request code can't return same result as yours
so I do not want to waste any second …

And no need to combine anything ! First just test them for their purpose …
 
Good morning.

I attach the file with the corresponding macros and site.

The first link (file 1) seems the best. Within this link, I seem to understand that it is written that if the link does not indicate the protocol (http: // or https: //), this is added to the macro. How is this not happening?

I would love the macro of file "1", extended to more links. At most I find an internet program that adds the protocol. If you think I've checked out 15,000 links by hand, this macro would be so much to me. :)

See you soon
 

Attachments

  • url 1.xls
    41 KB · Views: 31
  • url 2.xlsx
    10.9 KB · Views: 14
  • url 3.xlsx
    11.3 KB · Views: 16
As on my side row #14 can not be a redirection whatever the tool used
so I can't reproduce same wrong result …

The only way to have same wrong result is to pilot Internet Explorer but
it is also the slowest way which should need more than 110 seconds
to check initial sample file as a request code may take less than 55 seconds
(with a maximum of 20 codelines for those reading WinHttpRequest doc !)
and for those with VBA & VBScript skills parallel requests need around 20s
down to 15s but as times depend on quality of network connection …

This is a common subject on web with a lot of yet ready VBA request code …

Parallel requesting just needs an adaptation of a classic request code,
there are at least four examples on web,
two I have posted including one on Chandoo …

And - again :eek: - do not ask anything by any private conversation ‼
 

ThrottleWorks, Shahin, when you manually valid row #14 URL
whatever the webbrowser, what is the exact URL (or domain)
once the page finishes to be loaded ? Thanks.
 
I give up for this post :(
Hi ,

I am sure there are many other members in this forum who can help you ; I can immediately think of Chihiro.

Why don't you PM him and see what his response is ?

I would personally love to help you , but I am tied up at present. If you are willing to wait , I can certainly help you out.

You might also want to check out this member , who is a Microsoft manager , who is well-versed in VBA and automation - Fábio Negreiros.

Narayan
 
Thanks !

Like on my side, you don't get any redirection …
So I would like to understand how stephano can have a redirection ?‼
Except if it's just a typo in his worksheet he does not want to admit
- I clearly asked him several times to check it out ! -
or maybe he does not know what is an URL …
 
Hi Marc ,

He might be meaning something else by redirection.

The two links whose opening in the browser results in links that are identical to the input URL are :

http://www.technion.ac.il/

http://www.ut.ac.id/

In the other cases , the input URL is something , while the resulting link after the page has been opened in the browser is slightly different from the input URL. Probably this is redirection according to him.

Narayan
 
Thanks NARAYANK991. I'm a person who can wait. It will be a pleasure for me to receive your help.

Per mark:
In the first post I wrote explicitly:

It's important to keep in mind that it's not a redirection, when what's changing is just the part that is after the last "/" of the column "a" url:

example:

Http://www.poliba.it/ and http://www.poliba.it/xxx

Lastly, in the redirect column I would like to link to be reported indicating the first degree domain, not even the part that is after "/" after the extension.

I do not know if this check can be done with a macro at the same time as the search goes on. Obviously if you do not consider it a redirection (then I'll see what changes will actually be or I will insert a proper formula).

Thanks for the attention
 

Again you do not ever pay attention ! :rolleyes:

Last chance : check your result for row #14 ! (Direct link on post #15)
Like we do not have same result as yours as it is very difficult
to get a redirection result when no redirection exists ‼

And it's not only my point of view but those from books like from
pro users, pro network softwares and like VBA samples from web !

The VBA code from Leith Ross (post #8) is ready to use, just try it !
 
Hi Marc ,

He might be meaning something else by redirection.

The two links whose opening in the browser results in links that are identical to the input URL are :

http://www.technion.ac.il/

http://www.ut.ac.id/

In the other cases , the input URL is something , while the resulting link after the page has been opened in the browser is slightly different from the input URL. Probably this is redirection according to him.

Narayan


Well you understand ....
I have urls and this check is useful to see if the inside of the url has changed or not. That's why I'll have to extract emails for every url later and I'll have to organize them.
I'll explain it to you with the links you've got.
I have this link:
http://www.technion.ac.il/
The emails of this site will have the same path:

xxxx@technion.ac.il
zzzz@technion.ac.il
yyyy@technion.ac.il

If I know that the link is different as in this case:

http://www.ut.ac.id/

The emails will also be different, so the path will no longer be "technion.ac.il" but trusted "ut.ac.id"

So emails will become:

xxxx@ut.ac.id
zzzz@ut.ac.id
yyyy@ut.ac.id

You're bright
 
Again you do not ever pay attention ! :rolleyes:

Last chance : check your result for row #14 ! (Direct link on post #15)
As we do not have same result as yours …

As it is very difficult to get a redirection result when no redirection exists ‼
And it's not only my point of view but those from books like from
pro users, pro network softwares and like VBA samples from web !

The VBA code from Leith Ross you had post is ready to use, just try it !


Try putting a non-protocol address on google then copying and pasting the address to word. You will see that you also copy the protocol. Then google locates the exact link.
As I have just written, I could rule out cases that do not start with the protocol ...
 

Google ‼‼ :eek: :confused: :DD :rolleyes: o_O
The reason why you have difficulties to find a way !

Read post #13, so try and write back what is exactly your result !
 
Back
Top