Motivation

Before I started this tutorial, there were already plenty of articles on the topic “Customize Unreal Engine’s Rendering Pipeline”. Those articles were either intrusive, requiring changes to the engine source, or went deep into the Geometry Pipeline (or Mesh Pipeline), for example VertexFactory (a stage used to collect MeshBatch in the Geometry Pipeline).

In this tutorial, I will explain a method to draw your own custom pass from a plugin. Note that there are still many limitations if you don’t want to modify the engine source.

We are going to use Unreal Engine 5.3. Check Unreal Engine’s commit logs if something doesn’t work.

Preparation

Create your plugin by clicking the “Editor -> Edit -> Plugins -> Add” button, then add these dependencies to YourModule.Build.cs.

PrivateDependencyModuleNames.AddRange(
    new string[]
    {
        "Renderer",
        "RenderCore",
        "Projects",
        "RHI",
        "RHICore",
    }
);

Introduce ViewExtension

There is an interface named ISceneViewExtension. It was introduced in Unreal Engine 4.27, and later releases added more hooks. Classes derived from this interface are not tracked by the reflection system, so you need to register them yourself.

Let’s create our view extension class:

// OurViewExtension.h
// ... Ignored includes
class FOurViewExtension : public FWorldSceneViewExtension
{
public:
    FOurViewExtension(const FAutoRegister& AutoReg, UWorld* InWorld) : FWorldSceneViewExtension(AutoReg, InWorld) {}
    
    // +Impl interface ISceneViewExtension
    virtual void SetupViewFamily(FSceneViewFamily& InViewFamily) override;
    virtual void SetupView(FSceneViewFamily& InViewFamily, FSceneView& InView) override;
    virtual void BeginRenderViewFamily(FSceneViewFamily& InViewFamily) override;
    // -Impl interface ISceneViewExtension
};

// OurViewExtension.cpp
// Fill it with empty implementations

To make sure FOurViewExtension isn’t an abstract class, we need to provide implementations of SetupViewFamily, SetupView and BeginRenderViewFamily. Empty implementations are enough for now. The constructor taking FAutoRegister is required by the registration mechanism.

There are many dynamically dispatched callbacks (virtual functions) you can use to hook into the rendering pipeline. See the figure below.

Dynamic dispatch has to look up the address of the target function in the vtable, so we keep a single extension per world to avoid extra overhead. Let’s create a class derived from UWorldSubsystem to manage it.

// OurWorldSubsystem.h
#pragma once
// ... Ignored some includes
#include "OurWorldSubsystem.generated.h" // Required to use the UE reflection system; must be the last include.

// Add MODULENAME_API to the class or function declaration if you want to use it outside this module.

UCLASS()
class UOurWorldSubsystem : public UWorldSubsystem
{
    GENERATED_BODY()
    
public:
    // +Impl interface USubsystem
    virtual void Initialize(FSubsystemCollectionBase& Collection) override;
    virtual void Deinitialize() override;
    // -Impl interface USubsystem
protected:
    TSharedPtr<FOurViewExtension> ViewExtension;
};

// OurWorldSubsystem.cpp
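void UOurWorldSubsystem::Initialize(FSubsystemCollectionBase& Collection)
{
	Super::Initialize(Collection);

	UWorld* World = GetWorld();
	check(nullptr != World);

	// NewExtension registers the extension and returns the owning shared reference.
	ViewExtension = FSceneViewExtensions::NewExtension<FOurViewExtension>(World);
}

void UOurWorldSubsystem::Deinitialize()
{
	// Dropping the last strong reference destroys the extension.
	ViewExtension.Reset();

	Super::Deinitialize();
}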

The view extension instance you register is stored in an array declared as TArray< TWeakPtr<ISceneViewExtension, ESPMode::ThreadSafe> > KnownExtensions;. When ViewExtension.Reset() is called in Deinitialize() and you are not keeping any other copy of the shared pointer, the reference count drops to zero and the view extension instance is destroyed. The weak pointer in KnownExtensions then becomes invalid, so resetting the shared pointer is a safe way to unregister the extension.
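The lifetime rule above can be tried outside the engine. The following is a minimal sketch that uses std::shared_ptr and std::weak_ptr as stand-ins for TSharedPtr and TWeakPtr. FToyExtension, FToyRegistry and GatherActive are hypothetical names made up for this illustration; they are not engine types, and the real registry may prune its entries differently.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a view extension; not the engine type.
struct FToyExtension
{
    int Id = 0;
};

// Hypothetical stand-in for a registry that only keeps weak references.
class FToyRegistry
{
public:
    // Creates an extension, remembers it weakly, and hands ownership to the caller.
    std::shared_ptr<FToyExtension> NewExtension(int Id)
    {
        std::shared_ptr<FToyExtension> Extension = std::make_shared<FToyExtension>();
        Extension->Id = Id;
        KnownExtensions.push_back(Extension);
        return Extension;
    }

    // Counts the extensions that are still alive and drops the expired entries.
    std::size_t GatherActive()
    {
        std::size_t Alive = 0;
        for (auto It = KnownExtensions.begin(); It != KnownExtensions.end();)
        {
            if (It->expired())
            {
                It = KnownExtensions.erase(It);
            }
            else
            {
                ++Alive;
                ++It;
            }
        }
        return Alive;
    }

private:
    std::vector<std::weak_ptr<FToyExtension>> KnownExtensions;
};

// Mirrors Initialize()/Deinitialize(): returns true if the registry
// no longer sees the extension after the only owner resets its pointer.
bool DemoResetUnregisters()
{
    FToyRegistry Registry;
    std::shared_ptr<FToyExtension> ViewExtension = Registry.NewExtension(1);
    assert(Registry.GatherActive() == 1);
    ViewExtension.reset();
    return Registry.GatherActive() == 0;
}
```

The point of the design is that the registry never owns the extension. Whoever holds the strong pointer (our subsystem) decides when the extension dies, and the registry simply stops seeing it.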

Now we can hook into the rendering pipeline (with some limitations). We will get our first shader ready in the next section.

Shader Creation

You need to add a shader search path (also called a shader directory mapping or virtual shader path) before the engine starts. Locate the LoadingPhase of your module in YourPlugin.uplugin and set it to "PostConfigInit".

Go to your FYourModule::StartupModule() and add this code:

void FYourModule::StartupModule()
{
	IPluginManager& PluginManager = IPluginManager::Get();
	const TSharedPtr<IPlugin> Plugin = PluginManager.FindPlugin(TEXT("YourPluginName"));

	check(Plugin.IsValid());

	FString PluginDirectory = Plugin->GetBaseDir();
	const FString ModuleShaderDir = FPaths::Combine(PluginDirectory, TEXT("Shaders"));
	AddShaderSourceDirectoryMapping(TEXT("/MyShader"), ModuleShaderDir);
}
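To see why the shader is later referenced as "/MyShader/Private/FirstShader.usf", it helps to think of the mapping as a prefix substitution. The sketch below is a toy model of that idea only: FToyShaderDirectoryMappings and its methods are hypothetical names, the directory "Plugins/YourPlugin/Shaders" is an assumed example path, and the engine's actual lookup may apply more rules than this.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy mapping table: virtual prefix -> real directory. Hypothetical, not engine code.
class FToyShaderDirectoryMappings
{
public:
    void Add(const std::string& VirtualPrefix, const std::string& RealDirectory)
    {
        // This sketch expects prefixes such as "/MyShader": leading slash, no trailing slash.
        assert(!VirtualPrefix.empty());
        assert(VirtualPrefix.front() == '/' && VirtualPrefix.back() != '/');
        Mappings[VirtualPrefix] = RealDirectory;
    }

    // Writes the real path to OutRealPath and returns true if a mapping matches.
    bool Resolve(const std::string& VirtualPath, std::string& OutRealPath) const
    {
        for (const auto& Pair : Mappings)
        {
            const std::string& Prefix = Pair.first;
            const bool bHasPrefix = VirtualPath.compare(0, Prefix.size(), Prefix) == 0;
            // Match whole path segments only, so "/MyShader" does not capture "/MyShaderExtra/...".
            if (bHasPrefix && VirtualPath.size() > Prefix.size() && VirtualPath[Prefix.size()] == '/')
            {
                OutRealPath = Pair.second + VirtualPath.substr(Prefix.size());
                return true;
            }
        }
        return false;
    }

private:
    std::map<std::string, std::string> Mappings;
};
```

With the mapping "/MyShader" -> "Plugins/YourPlugin/Shaders", the virtual path "/MyShader/Private/FirstShader.usf" resolves to "Plugins/YourPlugin/Shaders/Private/FirstShader.usf" in this model.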

The rest of the code is shown here; it is easy to understand. We name the shader parameter struct FParameters inside the class scope, following Unreal Engine’s convention, which is also the name SHADER_USE_PARAMETER_STRUCT expects.

// OurShader.h
class FOurFirstVS : public FGlobalShader
{
    DECLARE_GLOBAL_SHADER(FOurFirstVS);
    SHADER_USE_PARAMETER_STRUCT(FOurFirstVS, FGlobalShader);
    
    BEGIN_SHADER_PARAMETER_STRUCT(FParameters, )
        SHADER_PARAMETER(uint32, TestParam)
    END_SHADER_PARAMETER_STRUCT()
        
    // Optional: modify the shader compilation environment. This is helpful for changing the shading path per platform.
    static FORCEINLINE void ModifyCompilationEnvironment(const FGlobalShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& Environment) {}
};

// OurShader.cpp
IMPLEMENT_GLOBAL_SHADER(FOurFirstVS, "/MyShader/Private/FirstShader.usf", "MainVS", SF_Vertex);

You can also use the pattern below if you want to share the same parameter struct across different shaders.

    
BEGIN_SHADER_PARAMETER_STRUCT(FOurFirstShaderParameters, )
    SHADER_PARAMETER(uint32, TestParam)
END_SHADER_PARAMETER_STRUCT()

// OurShader.h
class FOurFirstVS : public FGlobalShader
{
    DECLARE_GLOBAL_SHADER(FOurFirstVS);
    using FParameters = FOurFirstShaderParameters;
    SHADER_USE_PARAMETER_STRUCT(FOurFirstVS, FGlobalShader);
};

class FOurFirstPS : public FGlobalShader
{
    DECLARE_GLOBAL_SHADER(FOurFirstPS);
    using FParameters = FOurFirstShaderParameters;
    SHADER_USE_PARAMETER_STRUCT(FOurFirstPS, FGlobalShader);
};

// OurShader.cpp
IMPLEMENT_GLOBAL_SHADER(FOurFirstVS, "/MyShader/Private/FirstShader.usf", "MainVS", SF_Vertex);
IMPLEMENT_GLOBAL_SHADER(FOurFirstPS, "/MyShader/Private/FirstShader.usf", "MainPS", SF_Pixel);

You can now create a directory named Shaders/Private in your plugin’s base directory and create a file named FirstShader.usf in it.

// FirstShader.usf
#include "/Engine/Public/Platform.ush"

void MainVS(
    out float4 OutPosition : SV_POSITION
)
{
    OutPosition = float4(0, 0, 0, 1);
}

void MainPS(
    in float4 InPosition : SV_POSITION
) 
{
}

Make sure the inputs and outputs are correctly defined. The graphics driver will try to link all shaders set in the same rasterization pipeline together. I emphasize the rasterization pipeline because it is the hardware pipeline we are working with. We need to set the pipeline state on the graphics card to make it work.

Draw with our first shader

Add an override of void PostRenderBasePassDeferred_RenderThread(FRDGBuilder& GraphBuilder, FSceneView& InView, const FRenderTargetBindingSlots& RenderTargets, TRDGUniformBufferRef<FSceneTextureUniformParameters> SceneTextures) to your view extension.

We need to draw to multiple render targets because we are working with deferred rendering. Let’s change the shader parameter structure to bind the required data and buffers.

BEGIN_SHADER_PARAMETER_STRUCT(FOurFirstShaderParameters, )
    SHADER_PARAMETER_STRUCT_REF(FViewUniformShaderParameters, View)
    SHADER_PARAMETER_RDG_UNIFORM_BUFFER(FSceneTextureUniformParameters, SceneTextures)
    RENDER_TARGET_BINDING_SLOTS()
END_SHADER_PARAMETER_STRUCT()

In the PostRenderBasePassDeferred_RenderThread function:

void FOurViewExtension::PostRenderBasePassDeferred_RenderThread(FRDGBuilder& GraphBuilder, FSceneView& InView,
	const FRenderTargetBindingSlots& RenderTargets, TRDGUniformBufferRef<FSceneTextureUniformParameters> SceneTextures)
{
    if (!InView.ShouldRenderView() || !InView.bIsViewInfo) return;
    RDG_EVENT_SCOPE(GraphBuilder, "OurPass"); // Names this scope so it shows up in graphics debuggers such as RenderDoc.
    
    // Get ShaderMap
    FViewInfo& ViewInfo = static_cast<FViewInfo&>(InView);
    FGlobalShaderMap* GlobalShaderMap = ViewInfo.ShaderMap;
    
    // Get Shaders
    TShaderMapRef<FOurFirstVS> VertexShader(GlobalShaderMap);
    TShaderMapRef<FOurFirstPS> PixelShader(GlobalShaderMap);
    
    // Allocate shader parameters.
    // The RDGBuilder tracks their lifetime and keeps them alive for the pass they are handed to.
    // You don't need to release them.
    FOurFirstShaderParameters* Parameters = GraphBuilder.AllocParameters<FOurFirstShaderParameters>();
    Parameters->SceneTextures = SceneTextures;
    Parameters->View = ViewInfo.Family->Views.IsValidIndex(ViewInfo.PrimaryViewIndex) ? ViewInfo.Family->Views[ViewInfo.PrimaryViewIndex]->ViewUniformBuffer : ViewInfo.ViewUniformBuffer; // Use projection matrix in primary view
    Parameters->RenderTargets = RenderTargets;
    Parameters->RenderTargets.DepthStencil.SetDepthStencilAccess(FExclusiveDepthStencil::DepthWrite_StencilWrite); // We want to write into SceneDepth
    
    // Add the pass; it will be executed later.
    // Pass the allocated shader parameters to tell the RDGBuilder that this pass uses them.
    // Don't reuse them in any other pass.
    GraphBuilder.AddPass(RDG_EVENT_NAME("MyPassXD"), Parameters, ERDGPassFlags::Raster, [Parameters, VertexShader, PixelShader] (FRHICommandList& RHICmdList) 
    {
        // Set the raster pipeline state we mentioned above 
        FGraphicsPipelineStateInitializer GraphicsPSOInit;
        GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGBA, BO_Min, BF_One, BF_Zero, BO_Min, BF_One, BF_Zero>::GetRHI();
        GraphicsPSOInit.RasterizerState = TStaticRasterizerState<FM_Solid, CM_CCW>::GetRHI();
        GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<true, CF_DepthNearOrEqual, true, CF_Always, SO_Keep, SO_Keep, SO_Replace>::GetRHI();
        GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GetVertexDeclarationFVector4(); // The format of the stream source we set below.
        GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
        GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
        GraphicsPSOInit.PrimitiveType = PT_TriangleList;
        RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
        // ApplyCachedRenderTargets above already fills RenderTargetsEnabled from the bound render targets.
        GraphicsPSOInit.DepthStencilAccess = FExclusiveDepthStencil::DepthWrite_StencilWrite;
        SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit, 0);
        
        // Set Shader Parameter
        SetShaderParameters(RHICmdList, VertexShader, VertexShader.GetVertexShader(), *Parameters);
        SetShaderParameters(RHICmdList, PixelShader, PixelShader.GetPixelShader(), *Parameters);
        
        // Set the stream source.
        // It is an engine-provided buffer that contains eight vertex positions.
        RHICmdList.SetStreamSource(0, GetUnitCubeVertexBuffer(), 0);
        
        // Draw the cube!!!
        RHICmdList.DrawIndexedPrimitive(GetUnitCubeIndexBuffer(), 0, 0, 8, 0, UE_ARRAY_COUNT(GCubeIndices) / 3, 1);
    });
}
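The arguments of the draw call follow from simple arithmetic: a cube has 8 corner vertices and 6 faces, each face is split into 2 triangles, and each triangle consumes 3 indices. The sketch below checks these numbers against a hand-written index list. ToyCubeIndices is a hypothetical table written for this illustration; the engine's own GCubeIndices may order its triangles differently, and winding is ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical cube index list. Corner i has x = bit 0, y = bit 1, z = bit 2 of i.
static const std::uint16_t ToyCubeIndices[] = {
    0, 1, 3,  0, 3, 2,  // z = 0 face
    4, 5, 7,  4, 7, 6,  // z = 1 face
    0, 1, 5,  0, 5, 4,  // y = 0 face
    2, 3, 7,  2, 7, 6,  // y = 1 face
    0, 2, 6,  0, 6, 4,  // x = 0 face
    1, 3, 7,  1, 7, 5   // x = 1 face
};

// Number of elements in a C array, similar in spirit to UE_ARRAY_COUNT.
template <typename T, std::size_t N>
constexpr std::size_t ArrayCount(const T (&)[N])
{
    return N;
}

// A triangle list consumes three indices per primitive.
std::size_t TriangleCount(std::size_t NumIndices)
{
    assert(NumIndices % 3 == 0);
    return NumIndices / 3;
}

// Every index must address one of the NumVertices vertices in the vertex buffer.
bool AllIndicesInRange(const std::uint16_t* Indices, std::size_t NumIndices, std::size_t NumVertices)
{
    for (std::size_t Index = 0; Index < NumIndices; ++Index)
    {
        if (Indices[Index] >= NumVertices)
        {
            return false;
        }
    }
    return true;
}
```

In this model the table holds 36 indices, which gives 12 primitives over 8 vertices. These match the NumVertices and NumPrimitives values passed to DrawIndexedPrimitive above.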

We are almost there. We just need to change our shader code to match the data we stream into the Input Assembler (a fixed-function hardware stage controlled by the pipeline state).

#include "/Engine/Public/Platform.ush"
#include "/Engine/Private/DeferredShadingCommon.ush"
#include "/Engine/Private/ShadingModelsMaterial.ush"

void MainVS(
    float4 InPosition : ATTRIBUTE0, // Not POSITION because we haven't told the IA that this is a position attribute
    out float4 OutPosition : SV_POSITION // The data streamed to the pixel shader
)
{
    // We don't have a model matrix, so we assume the model-to-world transform is identity.
    // Output a clip space position; the rasterizer will convert it into NDC space.
    // It will also perform back face culling depending on the RasterizerState we set above.
    // CM_CCW means counter-clockwise is treated as the front face (viewed from the camera).
    const float4 TranslatedPosition = mul(InPosition, View.RelativeWorldToClip);
    OutPosition = TranslatedPosition;
}

void MainPS(
    in float4 SvPosition : SV_Position,
    out float4 OutTarget0 : SV_Target0,
    out float4 OutTarget1 : SV_Target1,
    out float4 OutTarget2 : SV_Target2,
    out float4 OutTarget3 : SV_Target3
)
{
    // The raw GBuffer data structure
    FGBufferData GBuffer = (FGBufferData)0;
    GBuffer.BaseColor = half3(0.5, 0.3, 0.8);
    GBuffer.Roughness = 1.0f;
    GBuffer.ShadingModelID = SHADINGMODELID_DEFAULT_LIT;
    GBuffer.WorldNormal = normalize(half3(0, 0, 1));
    GBuffer.Metallic = 0.0f;
    GBuffer.DiffuseColor = GBuffer.BaseColor;

    // Encode it into the GBuffer (with some compression)
    float4 OutGBufferA = 0;
    float4 OutGBufferB = 0;
    float4 OutGBufferC = 0;
    float4 OutGBufferD = 0;
    float4 OutGBufferE = 0;
    float4 OutVelocity = 0;
    float QuantizationBias = PseudoRandom( SvPosition.xy ) - 0.5f;
    
    EncodeGBuffer(GBuffer, OutGBufferA, OutGBufferB, OutGBufferC, OutGBufferD, OutGBufferE, OutVelocity, QuantizationBias);
    
    // See the figure below for the definition of the GBuffer components.
    OutTarget0 = RETURN_COLOR(half4(GBuffer.BaseColor, 1.0f));
    OutTarget1 = OutGBufferA;
    OutTarget2 = OutGBufferB;
    OutTarget3 = OutGBufferC;
}

It just worked!
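The vertex shader comment mentions that the rasterizer converts the clip space position into NDC space. That step is the perspective divide, and it can be sketched on the CPU as below. FToyVector3, FToyVector4 and both functions are hypothetical helpers for this illustration, and the depth range 0..W is an assumption that matches D3D-style conventions.

```cpp
#include <cassert>

// Hypothetical minimal vector types; not the engine's FVector types.
struct FToyVector3 { float X, Y, Z; };
struct FToyVector4 { float X, Y, Z, W; };

// Perspective divide: clip space -> normalized device coordinates.
FToyVector3 ClipToNdc(const FToyVector4& Clip)
{
    assert(Clip.W != 0.0f);
    return { Clip.X / Clip.W, Clip.Y / Clip.W, Clip.Z / Clip.W };
}

// True if the clip space point lies inside the view volume.
// Assumes -W <= X, Y <= W and a D3D-style depth range of 0 <= Z <= W.
bool IsInsideClipVolume(const FToyVector4& Clip)
{
    return Clip.W > 0.0f
        && -Clip.W <= Clip.X && Clip.X <= Clip.W
        && -Clip.W <= Clip.Y && Clip.Y <= Clip.W
        && 0.0f <= Clip.Z && Clip.Z <= Clip.W;
}
```

For example, the clip space point (2, -1, 1, 4) lies inside the volume and maps to the NDC point (0.5, -0.25, 0.25), while (5, 0, 1, 4) is outside because X exceeds W.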

GBuffer Layout

GBuffer layout of PC

The layout differs between platforms. See https://www.cnblogs.com/kekec/p/17050979.html .
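As an illustration of the kind of compression a GBuffer encode step performs, consider storing a unit normal. Its components lie in [-1, 1], while a normalized render target channel stores values in [0, 1], so a common approach is to remap with N * 0.5 + 0.5 before writing. The sketch below shows that remap with 8 bits per channel. The function names are hypothetical, the bit depth is an assumption chosen for simplicity, and the actual format and encoding used by the engine may differ per platform.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Maps one normal component from [-1, 1] to an 8-bit normalized value.
// The 8-bit depth is an assumption for this sketch only.
std::uint8_t EncodeNormalComponent(float Value)
{
    assert(Value >= -1.0f && Value <= 1.0f);
    const float Unorm = Value * 0.5f + 0.5f; // [-1, 1] -> [0, 1]
    return static_cast<std::uint8_t>(std::lround(Unorm * 255.0f));
}

// Inverse mapping: 8-bit normalized value -> [-1, 1].
float DecodeNormalComponent(std::uint8_t Stored)
{
    return (static_cast<float>(Stored) / 255.0f) * 2.0f - 1.0f;
}
```

The round trip is lossy. In this model a component comes back with an error of at most about 1/255, which is the quantization error that a dither term such as QuantizationBias is meant to hide.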