[Status: rough draft] [Split out from preceding proposal #61324]
The analysistest package is too monolithic: if you don't like the Run function, you're on your own. Now that it is possible (assuming #61324 is accepted) to call checker.Analyze programmatically, it should be possible to break the task of testing into three steps--load, analyze, assert--and to allow clients to replace the first and/or third steps with their own logic if they prefer.
Also, the existing API requires GOPATH layout, and cannot use modules. (It is tempting to honor a go.mod file within testdata, but this causes the testdata tree to belong to its own separate module, leading to tests that don't work when run from the module cache, which is bad practice.)
We propose to add a new analysistest API--alongside its existing ("legacy") API--as follows:
package analysistest // golang.org/x/tools/go/analysis/analysistest
// Test loads packages from the directory tree specified by file using
// [Load], analyzes them using [checker.Analyze], then checks
// expectations of diagnostics and facts in comments using [Check].
// Finally it returns the result of [checker.Analyze]. Errors are
// reported through t.Fatal.
//
// The file may be:
// - a directory,
// - a .txtar archive file, which is unpacked somewhere beneath t.TempDir(), or
// - a relative path to a .go file, which is loaded as an ad hoc package.
//
// This function is provided for convenience. Callers with more
// complicated requirements should call [Load], [checker.Analyze], and
// [Check] directly.
func Test(t *testing.T, file string, a *analysis.Analyzer) *checker.Result {
	// Create test data tree from file argument.
	dir, pattern := expandFile(t, file)
	pkgs := Load(t, dir, pattern)

	result, err := checker.Analyze([]*analysis.Analyzer{a}, pkgs, &checker.Options{})
	if err != nil {
		t.Fatal(err)
	}

	// Check facts, diagnostics, and suggested fixes.
	for _, act := range result.Roots {
		if act.Err != nil {
			t.Errorf("error analyzing %s: %v", act, act.Err)
			continue
		}
		Check(t, act, &CheckOptions{RootDir: dir})
	}
	return result
}
// Check inspects a single completed checker action and
// verifies that all reported diagnostics and facts match those
// specified by the contents of "// want ..." comments in the
// package's source files, which must have been parsed with comments
// enabled, and that all suggested fixes match the contents
// of any .golden files adjacent to the source files.
func Check(t *testing.T, act *checker.Action, opts *CheckOptions)
type CheckOptions struct {
	// RootDir is the root directory of the test source tree.
	// It is stripped as a prefix of each filename before
	// filenames are compared for equality or presented in error messages.
	RootDir string
}
// Load loads packages, their tests, and dependencies
// from syntax using packages.Load with an appropriate mode.
// On success, it writes list/parse/type errors to t.Log
// as they may be helpful when debugging a failing test,
// but may be expected in test cases of error scenarios.
//
// Failures are reported through t.Fatal.
func Load(t *testing.T, rootdir string, patterns ...string) []*packages.Package
type LegacyResult struct{ ... } // same declaration as previous internal/checker.TestAnalyzerResult.
type Result = LegacyResult
I have shown the body of the Test function to indicate that it is largely a helper function, a recipe of four steps: (1) expand the file string to a directory tree; (2) load packages from it; (3) analyze; (4) apply assertions based on the source file tree. Users who decide that this function isn't sufficient for their needs can easily inline or fork it; steps 2-4 are all now public functions.
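To make the assertion step concrete, here is a minimal, self-contained sketch (standard library only; collectWants and wantExpectation are invented names for illustration, not part of the proposed API) of how "// want ..." expectations can be harvested from a source file parsed with comments enabled--essentially the raw material that Check matches diagnostics against:

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strings"
)

// wantExpectation records one "// want ..." comment and the line it applies to.
type wantExpectation struct {
	Line    int
	Pattern string
}

// collectWants parses src (with comments enabled, as Check requires)
// and returns every "// want ..." expectation it finds.
func collectWants(filename, src string) ([]wantExpectation, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var wants []wantExpectation
	for _, cg := range f.Comments {
		for _, c := range cg.List {
			if rest, ok := strings.CutPrefix(c.Text, "// want "); ok {
				wants = append(wants, wantExpectation{
					Line:    fset.Position(c.Pos()).Line,
					Pattern: rest,
				})
			}
		}
	}
	return wants, nil
}

func main() {
	src := `package p

func f() {
	print() // want "not enough arguments"
}
`
	wants, err := collectWants("f.go", src)
	if err != nil {
		panic(err)
	}
	for _, w := range wants {
		fmt.Printf("%d: %s\n", w.Line, w.Pattern)
	}
}
```

The real Check is of course more elaborate (regular-expression matching, facts, .golden files), but the load/analyze/assert split means a client could substitute a harvester like this in step 4 while reusing Load and checker.Analyze unchanged.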
The LegacyResult declaration warrants some explanation. Previously, Result was an alias for internal/checker.TestAnalyzerResult, and the creation of the public alias accidentally published the internal type, which has several public fields. Unfortunately we can't retract it, so this proposal merely changes its name to LegacyResult, to emphasize its status. The Result alias is kept for compatibility, but we would prefer not to refer to it by this name, as it is easily confused with the new checker.Result type.
https://go.dev/cl/509395 contains a prototype of the new API. A few analyzers' tests have been updated to use the new API.
Questions still to resolve:
- Compatibility: will old testdata inputs continue to work? How can we automate the transition?
- The load/analyze/check model wants the load step to result in a tree of files, either because one was already there in testdata, or because it was expanded from a txtar file. But the existing checkSuggestedFixes uses txtar files, and they don't nest. What to do?
Comment From: gopherbot
Change https://go.dev/cl/509395 mentions this issue: go/analysis/analysistest: improved, module-aware API
Comment From: jba
The loop over result.Roots in Test seems unobjectionable, so in addition to (or instead of) Check, you could move it into its own function:

func CheckResult(t *testing.T, result *checker.Result, opts *CheckOptions)
You should comment on why you don't build the load functionality on top of go/packages/packagestest.
Perhaps the go tool should ignore go.mod files in testdata directories. Is there any reason to honor them?

As a workaround, you could accept an alternative filename, like _go.mod, and rename it when you copy the tree into the temp dir.
txtar files nest if you make them nest. You could say that a line of the form -- BEGIN filename -- starts a txtar file that includes everything up to a line -- END --.
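The txtar format itself has no notion of nesting, so this is purely a convention the test harness would impose. A standard-library-only sketch of that convention (the function name extractNested is invented; a real implementation would feed the extracted sections to the txtar package):

```go
package main

import (
	"fmt"
	"strings"
)

// extractNested scans the lines of an outer archive and returns the
// contents of each "-- BEGIN name --" ... "-- END --" section, keyed by
// name. Everything outside such sections is returned as the remainder,
// to be parsed as an ordinary txtar archive.
func extractNested(src string) (nested map[string]string, rest string) {
	nested = make(map[string]string)
	var outer, inner []string
	var name string // non-empty while inside a BEGIN/END section
	for _, line := range strings.Split(src, "\n") {
		switch {
		case name == "" && strings.HasPrefix(line, "-- BEGIN ") && strings.HasSuffix(line, " --"):
			name = strings.TrimSuffix(strings.TrimPrefix(line, "-- BEGIN "), " --")
			inner = nil
		case name != "" && line == "-- END --":
			nested[name] = strings.Join(inner, "\n")
			name = ""
		case name != "":
			inner = append(inner, line)
		default:
			outer = append(outer, line)
		}
	}
	return nested, strings.Join(outer, "\n")
}

func main() {
	src := `-- a.go --
package a
-- BEGIN fix.txtar --
-- a.go --
package a // fixed
-- END --
`
	nested, rest := extractNested(src)
	fmt.Printf("nested[fix.txtar] = %q\n", nested["fix.txtar"])
	fmt.Printf("rest = %q\n", rest)
}
```

This one-level scheme would be enough for checkSuggestedFixes, where the golden output is itself a txtar archive embedded in the test input.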
Comment From: gopherbot
Change https://go.dev/cl/577995 mentions this issue: go/analysis/analysistest: add a TestDir helper for modules
Comment From: adonovan
@timothy-king's https://go.dev/cl/577995 may be a simpler approach to this problem.
Comment From: seankhliao
I had opened #71478 with a different focus: for modules to use to test their own code (rather than to test the analyzers themselves), but I think there's quite some overlap with analysistest? It feels like the only big difference would be in how / what Check reports.
Comment From: AndrewHarrisSPU
I wanted to mention struct tags here. https://github.com/golang/go/issues/74472 has made a strong case that struct tags are being used in, uh, acrobatic ways - where static analysis could provide a safety net, like https://github.com/golang/go/issues/74376 provides for json tags from the standard library. Coming across this issue, I wanted to note that there might be some interesting cross-issue relevance.
The scenario is something like: a third-party project that has a struct tag scheme. The project developers are generally doing some internal testing around their struct tag schemes, with logic that would be reasonable to implement via static analysis. The affordances for offering that static analysis to users importing the project don't seem great.
On a throwaway basis: my mind had kind of wandered towards something like:
package local

import (
	"testing"

	"some/tag/provider"
)

func CheckProvider(sc *testing.SC) {
	sc.Analyze(provider.Check)
}
Where:
- CheckXXX functions with a *testing.SC argument coordinate along the lines of https://pkg.go.dev/golang.org/x/tools/go/analysis/multichecker. Instead of a binary with a simple Main function, the test binary arranges for unitchecker.Run to run with each analyzer seen in a call like sc.Analyze(provider.Check).
- The set of files analyzed is naturally described by go test: in local directory mode, over the current directory; in package list mode, more extensively but with caching.
- For the authors of some/tag/provider, maybe, they can orient around a vertical dependency on an AST inspection pass that collects a list of structs with tags.
- I was considering importers of some/tag/provider opting in to the static checking via testing, but another policy where the static check comes more automatically seems pretty interesting. Or even something a bit out-of-band that registers in gopls.
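The "AST inspection pass that collects structs with tags" mentioned above can be sketched with the standard library alone (collectTaggedFields and taggedField are invented names; a real analyzer would use go/analysis and go/types on top of this):

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// taggedField records one struct field that carries a tag.
type taggedField struct {
	Struct, Field, Tag string
}

// collectTaggedFields parses src and returns every named struct field
// with a tag: the raw material a tag-checking analyzer would validate.
func collectTaggedFields(src string) ([]taggedField, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []taggedField
	ast.Inspect(f, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok {
			return true
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return true
		}
		for _, field := range st.Fields.List {
			if field.Tag == nil {
				continue
			}
			for _, name := range field.Names {
				out = append(out, taggedField{ts.Name.Name, name.Name, field.Tag.Value})
			}
		}
		return true
	})
	return out, nil
}

func main() {
	src := `package p

type User struct {
	Name string ` + "`json:\"name\"`" + `
	Age  int
}
`
	fields, err := collectTaggedFields(src)
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%s.%s has tag %s\n", f.Struct, f.Field, f.Tag)
	}
}
```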
Whether or not there's anything sensible in that, I'm glad to see there's some interest in affordances around static analysis here.
Comment From: adonovan
(BTW I think this idea has more in common with https://github.com/golang/go/issues/59869 than with this issue, which is concerned narrowly with how you write tests for analyzers.)
I studied the problem earlier in the year, thinking about a design in which some declaration (waves hands; could be Java-style annotations, but more likely Go directive comments such as //go:fix inline) lets the author of an analyzer express something like "when you see a call to function foo.Bar, run this analyzer on the package". That would allow many checkers (e.g. printf) to be written outside vet itself, and would allow providers of third-party modules to ship API checkers with them.
The two biggest challenges are:
- Security: we would need to sandbox untrusted code in a Chromium-style sandbox, since running go vet, unlike go test, does not imply any trust in the target code. I think this is feasible, but it's quite a project and we're a small team.
- Quality: even for Go experts well versed in packages such as go/ast and go/types, it has proven very hard to write high-quality analyzers. I worry that opening the floodgates could--assuming these analyzers run in gopls by default--make the editing experience unstable, slow, and noisy, littering the code with unwanted warnings of low quality. It was in fact this latter concern that caused me to put the work aside.
I would like to find out more about the state of the art in other language communities. (I know of at least one--which I shall not name--in which the security question was completely ignored.)
Comment From: Merovius
Another reason why I'm interested in being able to run analysis from tests is that it means I can add a bunch of linters/checkers as tests to my project and get integration into CI for free. I also don't have to deal with contributors not knowing which linters/checkers I expect to be happy: it's enough to run go test, which people are used to running anyway.