
Dynamic Robots.Txt Generator in MVC5 Application

A robots.txt file is important because it tells search engine crawlers what they should and should not crawl.  Unfortunately, they don't always listen, but we should take advantage of the ones that do!  In today's world, websites aren't simply collections of static pages: new content is constantly being posted, and what's there one day may well be gone the next.  A dynamically generated robots.txt can keep up with that.

Before I begin, I should mention that a lot of this was inspired by Ben Foster's Fabrik.Common library, especially the SiteMap generator.  I took a lot of the same principles and applied them to a slightly different use case, and abstracted most of it into a class library to avoid copying and pasting every time I need to publish a website.

Robots.txt Generator

The robots.txt generator was made into a class library so it can be reused across many applications.  It has a robots generator that is responsible for transforming a collection of robots items (the model) into the string that becomes our .txt file.  Finally, there's a robots result, which extends the MVC action result to write the robots.txt string to the output stream.
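
The library's source isn't reproduced in this post, so to make the moving parts concrete, here's a minimal sketch of what the model and the action result might look like.  The member names, enum values, and grouping logic are assumptions based on the description above, not the library's actual code (and for brevity, the string-building the post attributes to a separate generator class is folded into the result):


using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Web.Mvc;

namespace RobotsTxtGenerator
{
    public class RobotsItem
    {
        public enum AccessTypeEnum { Allow, Disallow }
        public enum UserAgentEnum { All, GoogleBot }

        public string Url { get; private set; }
        public AccessTypeEnum AccessType { get; private set; }
        public UserAgentEnum UserAgent { get; private set; }

        public RobotsItem(string Url, AccessTypeEnum AccessType, UserAgentEnum UserAgent)
        {
            this.Url = Url;
            this.AccessType = AccessType;
            this.UserAgent = UserAgent;
        }
    }

    public class RobotsResult : ActionResult
    {
        private readonly List<RobotsItem> _Items;

        public RobotsResult(List<RobotsItem> Items)
        {
            _Items = Items;
        }

        public override void ExecuteResult(ControllerContext context)
        {
            // Build the robots.txt text, grouping directives under their user agent.
            StringBuilder Builder = new StringBuilder();
            foreach (var Group in _Items.GroupBy(x => x.UserAgent))
            {
                Builder.AppendLine("User-agent: " + (Group.Key == RobotsItem.UserAgentEnum.All ? "*" : "Googlebot"));
                foreach (RobotsItem Item in Group)
                {
                    Builder.AppendLine(Item.AccessType.ToString() + ": " + Item.Url);
                }
                Builder.AppendLine();
            }

            // Write the generated string to the output stream as plain text.
            context.HttpContext.Response.ContentType = "text/plain";
            context.HttpContext.Response.Write(Builder.ToString());
        }
    }
}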

To use this library, you simply need to include it as a reference in your MVC project, create a method that generates your collection of robots items, and then return the robots result.  For example, in the Home controller:


public ActionResult RobotsTxt()
{
    // Class that contains the methods to generate our robots items.
    BLL.SEOMethods SEOHelper = new BLL.SEOMethods();
    // Get our robots items
    List<RobotsTxtGenerator.RobotsItem> RobotsTxtItems = SEOHelper.GetRobotsTXTItems(this.Url);
    // Return the robots result
    return new RobotsTxtGenerator.RobotsResult(RobotsTxtItems);
} // end robots txt

 

Inside the SEOMethods class, we have:


/// <summary>
/// Gets all of the robots items that will create our robots.txt file
/// </summary>
/// <param name="Url"></param>
/// <returns></returns>
public List<RobotsItem> GetRobotsTXTItems(UrlHelper Url)
{
    List<RobotsItem> AllRobotsItems = new List<RobotsItem>();

    // Get the pages we want to explicitly allow.
    AllRobotsItems.AddRange(GetAllowedRobotsTxtItems(Url));

    // Get the disallowed pages
    AllRobotsItems.AddRange(GetDisallowedItems(Url));

    // Get pages you only want Google to find
    AllRobotsItems.AddRange(GetGoogleOnlyPages(Url));

    return AllRobotsItems;
} // end get robots txt items

 

Here's one of those methods, to give an idea of how the items are added:


/// <summary>
/// Gets a collection of robots items to be marked as allowed for all user agents.
/// </summary>
/// <param name="Url"></param>
/// <returns></returns>
private List<RobotsItem> GetAllowedRobotsTxtItems(UrlHelper Url)
{
    List<RobotsItem> AllowedPages = new List<RobotsItem>();
    AllowedPages.Add(new RobotsItem(Url.Action("Index", "Home"), RobotsItem.AccessTypeEnum.Allow, RobotsItem.UserAgentEnum.All));
    AllowedPages.Add(new RobotsItem(Url.Action("About", "Home"), RobotsItem.AccessTypeEnum.Allow, RobotsItem.UserAgentEnum.All));
    AllowedPages.Add(new RobotsItem(Url.Action("Contact", "Home"), RobotsItem.AccessTypeEnum.Allow, RobotsItem.UserAgentEnum.All));

    //// If you had a blog and wanted to add all of your blog posts, you could do something like the following:
    //IEnumerable<Int32> PostIds = _db.Posts.Select(x => x.Id);
    //foreach (Int32 PostId in PostIds)
    //{
    //    AllowedPages.Add(new RobotsItem(Url.Action("ViewPost", "Blog", new { PostId = PostId }), RobotsItem.AccessTypeEnum.Allow, RobotsItem.UserAgentEnum.All));
    //}

    return AllowedPages;
} // end get allowed robots txt items
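
The disallowed and Google-only methods follow the exact same pattern.  As a sketch of a user-agent-specific rule (the "Promotions" action is a hypothetical page, and the GoogleBot enum value is an assumption, since the library source isn't shown here):


/// <summary>
/// Gets a collection of robots items that apply only to Google's crawler.
/// </summary>
/// <param name="Url"></param>
/// <returns></returns>
private List<RobotsItem> GetGoogleOnlyPages(UrlHelper Url)
{
    List<RobotsItem> GoogleOnlyPages = new List<RobotsItem>();

    // Hypothetical page that only Googlebot should be told about.
    GoogleOnlyPages.Add(new RobotsItem(Url.Action("Promotions", "Home"), RobotsItem.AccessTypeEnum.Allow, RobotsItem.UserAgentEnum.GoogleBot));

    return GoogleOnlyPages;
} // end get google only pages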

 

There are a couple of other things you need to do to get this working in an MVC site:

Configure the Robots.txt Route

In the App_Start folder, there's a RouteConfig.cs file.  In there, you should add the following:

// For our dynamically created robots.txt
routes.MapRoute(
    name: "robots.txt",
    url: "robots.txt",
    defaults: new { controller = "Home", action = "RobotsTxt" }
);

This tells the MVC application what to do when /robots.txt is requested.  Instead of attempting to serve a static file that doesn't exist, it will serve up the RobotsTxt action on our Home controller.
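
One caveat: route registration order matters.  The robots.txt route has to be mapped before the catch-all Default route, or the request will be matched as a controller named "robots.txt" instead.  In context, RegisterRoutes would look something like this (the Default route shown is just the standard MVC5 template route):


public static void RegisterRoutes(RouteCollection routes)
{
    routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

    // For our dynamically created robots.txt - mapped before the default
    // route so the literal "robots.txt" URL wins the match.
    routes.MapRoute(
        name: "robots.txt",
        url: "robots.txt",
        defaults: new { controller = "Home", action = "RobotsTxt" }
    );

    routes.MapRoute(
        name: "Default",
        url: "{controller}/{action}/{id}",
        defaults: new { controller = "Home", action = "Index", id = UrlParameter.Optional }
    );
}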

Configure IIS to Serve the File

IIS may not play nicely with this one at first, but a simple change to the web.config and you're good to go!  Add the following in the <system.webServer><handlers> section of your web.config:


<system.webServer>
  <handlers>
    <!-- Tell IIS to serve our robots.txt file | There's also a route set up in App_Start\RouteConfig.cs for this -->
    <add name="Robots" path="robots.txt" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
  </handlers>
</system.webServer>
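
With the route and handler in place, requesting /robots.txt returns the generated file.  Given the allowed items from earlier (and assuming the generator groups directives by user agent, as in the sketch above), the response would look something like:


User-agent: *
Allow: /
Allow: /Home/About
Allow: /Home/Contact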

 

That's all you need to get going!  You can get to the source via my GitHub account.
