
How to Block AI Bots on F# Giraffe: Complete 2026 Guide

Giraffe is the functional F# web framework on ASP.NET Core. Handlers compose with >=> (the fish operator). To block an AI bot: call the 403-response pipeline with earlyReturn ctx — the next handler is never invoked. Place public routes (robots.txt, /health) before botBlocker in the choose list.

The >=> fish operator — how Giraffe composes handlers

h1 >=> h2 creates a new handler. When called, h1 receives h2 as its next function. If h1 calls next ctx, h2 runs. If h1 uses earlyReturn ctx, h2 never runs — the chain terminates.

type HttpHandler = HttpFunc -> HttpContext -> Task<HttpContext option>
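Conceptually, the fish operator feeds the right-hand handler in as the left-hand handler's continuation. A simplified sketch of the composition (Giraffe's actual `compose` in Giraffe.Core adds a check that skips the second handler when the response has already started):

```fsharp
// Simplified sketch of Giraffe's compose, the function behind >=>.
// Not the real implementation: the library version also short-circuits
// when ctx.Response.HasStarted is true.
let compose (h1: HttpHandler) (h2: HttpHandler) : HttpHandler =
    fun (final: HttpFunc) (ctx: HttpContext) ->
        // h2, applied to the final continuation, becomes h1's `next`.
        h1 (h2 final) ctx

// let (>=>) = compose
```

This is why `earlyReturn` works: if h1 never invokes the continuation it was given, h2 simply never runs.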

Protection layers

1. robots.txt: route "/robots.txt" placed first in the choose list, so it is served before botBlocker; or UseStaticFiles() with the file in wwwroot/
2. noai meta tag: in htmlView, add <meta name="robots" content="noai, noimageai"> inside <head>
3. X-Robots-Tag (blocked): setHttpHeader "X-Robots-Tag" "noai, noimageai" in the 403 pipeline before earlyReturn ctx
4. X-Robots-Tag (legitimate): apply setHttpHeader "X-Robots-Tag" "noai, noimageai" before calling next ctx in the pass-through branch
5. Hard 403: (setStatusCode 403 >=> text "Forbidden") earlyReturn ctx; next is never called

Step 1 — Bot detection (AiBots.fs)

Guard with String.IsNullOrEmpty before any comparison. Contains is called with StringComparison.OrdinalIgnoreCase, so no manual lowercasing is needed. List.exists short-circuits on the first match.

// AiBots.fs — bot detection module

module AiBots

open System

// Known AI bot UA substrings, compared case-insensitively.
let private patterns = [
    // OpenAI
    "gptbot"; "chatgpt-user"; "oai-searchbot"
    // Anthropic
    "claudebot"; "claude-web"
    // Common Crawl
    "ccbot"
    // Bytedance
    "bytespider"
    // Meta
    "meta-externalagent"
    // Perplexity
    "perplexitybot"
    // Google AI
    "google-extended"; "googleother"
    // Cohere
    "cohere-ai"
    // Amazon
    "amazonbot"
    // Diffbot
    "diffbot"
    // AI2
    "ai2bot"
    // DeepSeek
    "deepseekbot"
    // Mistral
    "mistralai-user"
    // xAI
    "xai-bot"
    // You.com
    "youbot"
    // DuckDuckGo AI
    "duckassistbot"
]

/// isAiBot: returns true if ua contains any known AI bot pattern.
/// Case-insensitive — uses StringComparison.OrdinalIgnoreCase.
let isAiBot (ua: string) =
    if String.IsNullOrEmpty ua then false
    else
        patterns |> List.exists (fun p ->
            ua.Contains(p, StringComparison.OrdinalIgnoreCase))
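A quick FSI check of the matcher (the user-agent strings below are illustrative):

```fsharp
// dotnet fsi, after loading AiBots.fs
isAiBot "Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)"
// true: matches "gptbot" case-insensitively

isAiBot "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/126.0"
// false: no pattern matches

isAiBot null
// false: caught by the IsNullOrEmpty guard
```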

Step 2 — Bot-blocker handler (BotBlocker.fs)

The handler reads the user agent with ctx.Request.Headers["User-Agent"].ToString(). StringValues.ToString() returns "" (not null) when the header is absent, so no null check is needed. The blocked branch calls the response pipeline with earlyReturn ctx, not with next ctx.

// BotBlocker.fs — Giraffe HttpHandler

module BotBlocker

open System
open Microsoft.AspNetCore.Http
open Giraffe
open AiBots

// HttpHandler type: HttpFunc -> HttpContext -> Task<HttpContext option>
// HttpFunc type:    HttpContext -> Task<HttpContext option>
//
// >=> (fish operator): composes two HttpHandlers.
// earlyReturn: terminal HttpFunc that returns Some ctx immediately.
//
// To short-circuit: call handler pipeline with earlyReturn ctx
//   — next (the inner app) is never called.
// To pass through: call next ctx after optionally modifying context.
let botBlocker : HttpHandler =
    fun next ctx ->
        // ctx.Request.Headers["User-Agent"] returns StringValues.
        // .ToString() returns "" if the header is absent — never null.
        let ua = ctx.Request.Headers["User-Agent"].ToString()

        if isAiBot ua then
            // Short-circuit: compose 403 response handlers and call with earlyReturn.
            // earlyReturn is the terminal HttpFunc — chain stops here.
            // next is never called — no downstream handlers run.
            (       setStatusCode 403
                >=> setHttpHeader "X-Robots-Tag" "noai, noimageai"
                >=> setHttpHeader "Content-Type" "text/plain; charset=utf-8"
                >=> text "Forbidden"
            ) earlyReturn ctx
        else
            // Pass through: add X-Robots-Tag, then continue with next.
            // setHttpHeader is an HttpHandler, so apply it directly to
            // next and ctx; composing it with next via >=> would not
            // type-check (next is an HttpFunc, not an HttpHandler).
            setHttpHeader "X-Robots-Tag" "noai, noimageai" next ctx
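If you want visibility into what gets blocked, a variant can log before short-circuiting via Giraffe's ctx.GetLogger() HttpContext extension (the logger category name here is arbitrary):

```fsharp
// Variant: log blocked bots before returning 403.
// ctx.GetLogger is a Giraffe extension over the registered ILoggerFactory.
open Microsoft.Extensions.Logging

let botBlockerLogged : HttpHandler =
    fun next ctx ->
        let ua = ctx.Request.Headers["User-Agent"].ToString()
        if isAiBot ua then
            let logger = ctx.GetLogger "BotBlocker"
            logger.LogInformation("Blocked AI bot: {UserAgent} on {Path}", ua, ctx.Request.Path)
            (setStatusCode 403 >=> text "Forbidden") earlyReturn ctx
        else
            next ctx
```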

Step 3 — Router with choose

choose tries each handler in list order. The first that doesn't return None wins. robots.txt and /health are listed before botBlocker — they're served to all crawlers without any bot check.

// App.fs — Giraffe router with bot-blocker middleware

module App

open Giraffe

let private robotsTxt = """User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /
"""

// choose tries each handler in order.
// The first handler that does NOT return None wins.
// Place robots.txt and /health BEFORE botBlocker so they are
// served to all crawlers without hitting the bot check.
let webApp : HttpHandler =
    choose [
        // Public routes: these bypass botBlocker
        route "/robots.txt" >=>
            setHttpHeader "Content-Type" "text/plain; charset=utf-8" >=>
            text robotsTxt

        route "/health" >=> text "ok"

        // Protected routes — botBlocker applied to all of these
        BotBlocker.botBlocker >=>
        choose [
            route "/"           >=> htmlView Views.homeView
            route "/api/data"   >=> json {| data = "protected" |}
            setStatusCode 404   >=> text "Not Found"
        ]
    ]
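If you only want to protect a path prefix rather than the whole site, Giraffe's subRoute can scope the blocker. A sketch (route names are illustrative):

```fsharp
// Scope botBlocker to /api only; all other routes skip the UA check.
let scopedApp : HttpHandler =
    choose [
        route "/robots.txt" >=> text robotsTxt
        route "/"           >=> htmlView Views.homeView

        // Routes inside subRoute match relative to the prefix,
        // so route "/data" here matches /api/data.
        subRoute "/api" (
            BotBlocker.botBlocker >=>
            choose [
                route "/data" >=> json {| data = "protected" |}
            ]
        )

        setStatusCode 404 >=> text "Not Found"
    ]
```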

Step 4 — Host setup (Program.fs)

UseStaticFiles() before UseGiraffe — if robots.txt is in wwwroot/, ASP.NET Core serves it without entering the Giraffe pipeline at all.

// Program.fs — ASP.NET Core host with Giraffe

module Program

open Microsoft.AspNetCore.Builder
open Microsoft.AspNetCore.Hosting
open Microsoft.Extensions.Hosting
open Microsoft.Extensions.DependencyInjection
open Giraffe
open App

[<EntryPoint>]
let main argv =
    Host.CreateDefaultBuilder(argv)
        .ConfigureWebHostDefaults(fun webHost ->
            webHost
                .ConfigureServices(fun services ->
                    // Register Giraffe services (required)
                    services.AddGiraffe() |> ignore
                )
                .Configure(fun app ->
                    // UseStaticFiles serves wwwroot/ — including robots.txt if placed there.
                    // Runs BEFORE Giraffe, so /robots.txt is handled without entering the Giraffe pipeline.
                    app.UseStaticFiles() |> ignore

                    // UseGiraffe wires the Giraffe app as the terminal middleware.
                    app.UseGiraffe(webApp)
                )
                |> ignore
        )
        .Build()
        .Run()
    0
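F# compiles files in the order they appear in the project file, so AiBots.fs must precede BotBlocker.fs, which must precede App.fs and Program.fs. A minimal project-file sketch (the project name, Views.fs file, and package version are illustrative):

```xml
<!-- BotGuard.fsproj (hypothetical name) — Compile order matters in F#. -->
<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="AiBots.fs" />
    <Compile Include="BotBlocker.fs" />
    <Compile Include="Views.fs" />
    <Compile Include="App.fs" />
    <Compile Include="Program.fs" />
  </ItemGroup>
  <ItemGroup>
    <PackageReference Include="Giraffe" Version="6.*" />
  </ItemGroup>
</Project>
```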

Saturn and Falco variants

Saturn is built on Giraffe — the same HttpHandler type and >=> composition work unchanged. Falco uses a different handler model but the same HttpContext access pattern.

// Saturn variant — Saturn is built on Giraffe.
// Giraffe HttpHandlers (including botBlocker) work unchanged in Saturn.

open Giraffe   // for json, htmlView, text
open Saturn
open App       // note: robotsTxt must be made non-private in App.fs to use it here
open BotBlocker

// Saturn router — uses a computation expression
let apiRouter = router {
    get "/data" (json {| data = "protected" |})
}

// Apply botBlocker by composing it (with >=>) in front of the forwarded router.
// Alternatively, Saturn's pipe_through accepts any Giraffe HttpHandler.
let appRouter = router {
    get   "/"           (htmlView Views.homeView)
    get   "/robots.txt" (text robotsTxt)
    get   "/health"     (text "ok")
    // Forward /api/* through botBlocker first
    forward "/api" (botBlocker >=> apiRouter)
}

let app = application {
    use_router appRouter
    use_static "wwwroot"
}

run app
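An alternative to composing with >=> is Saturn's pipeline computation expression, applied to a router with pipe_through. A sketch, assuming Saturn's pipeline { plug ... } and pipe_through keywords:

```fsharp
// Saturn pipeline variant: plug accepts any Giraffe HttpHandler.
let blockBots = pipeline {
    plug BotBlocker.botBlocker
}

// pipe_through runs the pipeline before every route in this router.
let protectedApi = router {
    pipe_through blockBots
    get "/data" (json {| data = "protected" |})
}

// forward "/api" protectedApi   // mount under /api in the main router
```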

// Falco variant: Falco's HttpHandler is HttpContext -> Task and handlers
// are terminal. There is no `next` to call, so wrap protected handlers
// instead of composing with a continuation.
//
// open Falco
//
// let forbidden : HttpHandler =
//     Response.withStatusCode 403
//     >> Response.ofPlainText "Forbidden"
//
// /// Wrap any Falco handler with the AI-bot check.
// let protect (handler: HttpHandler) : HttpHandler =
//     fun ctx ->
//         let ua = ctx.Request.Headers["User-Agent"].ToString()
//         if AiBots.isAiBot ua then forbidden ctx
//         else
//             ctx.Response.Headers.Append("X-Robots-Tag", "noai, noimageai")
//             handler ctx

F# Giraffe vs C# ASP.NET Core vs Falco vs Saturn

Handler type
  • Giraffe: HttpHandler = HttpFunc -> HttpContext -> Task<HttpContext option>
  • C# ASP.NET Core: RequestDelegate = HttpContext -> Task (imperative, no return value)
  • Falco: HttpHandler = HttpContext -> Task (simpler, no option wrapper)
  • Saturn: same as Giraffe; Saturn is built on top of Giraffe

Short-circuit
  • Giraffe: (setStatusCode 403 >=> text "Forbidden") earlyReturn ctx; next is never called
  • C# ASP.NET Core: set ctx.Response.StatusCode = 403, await ctx.Response.WriteAsync(), return; do NOT call next()
  • Falco: Response.withStatusCode 403 >> Response.ofPlainText "Forbidden"; handlers are terminal
  • Saturn: same earlyReturn as Giraffe; Saturn uses Giraffe under the hood

>=> composition
  • Giraffe: h1 >=> h2, Kleisli composition; h2 only runs if h1 calls next
  • C# ASP.NET Core: app.Use(async (ctx, next) => { ... await next(ctx); ... }), imperative
  • Falco: no >=>; uses |> piping and explicit composition
  • Saturn: same >=> as Giraffe, plus the router { } CE on top

UA header access
  • All four: ctx.Request.Headers["User-Agent"].ToString(); StringValues.ToString() is safe, never null

robots.txt
  • Giraffe: route "/robots.txt" before botBlocker in the choose [...] list, or UseStaticFiles()
  • C# ASP.NET Core: app.UseStaticFiles() in Configure before app.UseMiddleware<BotBlocker>(), with wwwroot/robots.txt
  • Falco: a get "/robots.txt" handler before protected routes, or UseStaticFiles()
  • Saturn: use_static "wwwroot" in the application CE, or a route before botBlocker in the router

Choose / routing
  • Giraffe: choose [r1; r2; r3] tries handlers in order; first non-None wins
  • C# ASP.NET Core: app.MapGet("/", ...) endpoint routing, pattern matching on path
  • Falco: get "/" handler route list; similar route matching, different API
  • Saturn: router { get "/" handler } computation expression wrapping Giraffe routing

Summary

  • earlyReturn ctx short-circuits — call the 403 pipeline with earlyReturn instead of next. The chain terminates; no downstream handlers run.
  • >=> composes left-to-right: h1 >=> h2 runs h1 first; h2 runs only if h1 calls next. Build your pipeline like a data pipeline.
  • StringValues.ToString() — always safe, never null. Returns "" when the header is absent.
  • choose list order matters — robots.txt and public routes before botBlocker. First match wins.
  • Saturn uses the same HttpHandler: Giraffe middleware works unchanged in Saturn apps, with no porting needed.

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.