Skip to content

Conversation

neild
Copy link
Contributor

@neild neild commented Mar 2, 2018

Rework the handling of imported packages.

Do not add imported protobuf packages to a global registry. Instead,
maintain a per-file list of packages and names. This avoids imports
in one file affecting the generated code in a second one.

Consistently deal with imports in terms of import path, rather than
source file. This avoids imports of two .proto files in the same
Go package producing redundant import statements.

Introduce ImportPath and PackageName types to prevent confusion about
which values contain import paths ("github.com/golang/protobuf/proto")
and which contain package names ("proto").

Convert many uses of FileDescriptorProto to FileDescriptor, for
consistency and general convenience.

Rework the handling of imported packages.

Do not add imported protobuf packages to a global registry. Instead,
maintain a per-file list of packages and names. This avoids imports
in one file affecting the generated code in a second one.

Consistently deal with imports in terms of import path, rather than
source file. This avoids imports of two .proto files in the same
Go package producing redundant import statements.

Introduce ImportPath and PackageName types to prevent confusion about
which values contain import paths ("github.com/golang/protobuf/proto")
and which contain package names ("proto").

Convert many uses of FileDescriptorProto to FileDescriptor, for
consistency and general convenience.
@neild neild requested a review from dsnet March 2, 2018 17:10
@neild
Copy link
Contributor Author

neild commented Mar 2, 2018

The .pb.go diffs should give a sense as to what this is changing.

neild added 2 commits March 2, 2018 09:24
Rework the handling of imported packages.

Do not add imported protobuf packages to a global registry. Instead,
maintain a per-file list of packages and names. This avoids imports
in one file affecting the generated code in a second one.

Consistently deal with imports in terms of import path, rather than
source file. This avoids imports of two .proto files in the same
Go package producing redundant import statements.

Introduce ImportPath and PackageName types to prevent confusion about
which values contain import paths ("github.com/golang/protobuf/proto")
and which contain package names ("proto").

Convert many uses of FileDescriptorProto to FileDescriptor, for
consistency and general convenience.
@neild
Copy link
Contributor Author

neild commented Mar 2, 2018

Sigh, forgot that "make regenerate" doesn't actually do anything unless you install protoc-gen-go. Need to fix that.

plugins = append(plugins, p)
}

// An ImportPath is the import path of a Go package. e.g., "github.com/golang/protobuf".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "google.golang.org/genproto/protobuf"

since "github.com/golang/protobuf" is not a go package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

slash := strings.LastIndex(opt, "/")
if slash < 0 {
return
return "", cleanPackageName(opt), true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised we even need to clean this. I expect protoc-gen-go to report an error if the package name supplied by the user is invalid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Removed.

}
// Use the file base name.
return baseName(d.GetName()), false
return cleanPackageName(baseName(d.GetName())), false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... really... a proto file may be missing a package statement!?

Interestingly, the BNF even permits there to be multiple package clauses:

proto = syntax { import | package | option | topLevelDef | emptyStatement }

}

// An ImportPath is the import path of a Go package. e.g., "github.com/golang/protobuf".
type ImportPath string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename as GoImportPath? Clearly distinguishes it from a proto import path, which is a legitimate concept.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

type ImportPath string

// A PackageName is the name of a Go package. e.g., "protobuf".
type PackageName string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, rename as GoPackageName to disambiguate from proto packages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Package names already registered. Key is the name from the .proto file;
// value is the name that appears in the generated code.
var pkgNamesInUse = make(map[string]bool)
var globalPackageNames = map[PackageName]int{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the integer mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me being too clever. Rewrote to not be clever.

if f != nil {
uniquePackageName[f.FileDescriptorProto] = pkg
name := cleanPackageName(pkg)
globalPackageNames[name]++
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong. Should you only increment if there is a conflict?

case PackageName:
g.WriteString(string(v))
case ImportPath:
g.WriteString(string(v))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be g.WriteString(strconv.Quote(string(v)))?
That way you don't need to keep calling strconv.Quote manually below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thought. Done.

if _, ok := g.usedPackages[pname]; !ok {
pname = "_"
if _, ok := g.usedPackages[importPath]; !ok {
packageName = "_"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand from the comment why this is necessary, but whatever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a .proto file imports a.proto, we need to import a.proto's Go package for its side effects (i.e., the things it registers) even if we don't reference any of the symbols in it.

In a better world, protos would be IWYU clean and this wouldn't be necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not exactly what the comment says. Worth updating?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment is fine as is; if you want to update it in a different cl, feel free.

if _, ok := g.usedPackages[pname]; !ok {
pname = "_"
if _, ok := g.usedPackages[importPath]; !ok {
packageName = "_"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not exactly what the comment says. Worth updating?

g.P("import (")
g.P(contextPkg, " ", strconv.Quote(path.Join(g.gen.ImportPrefix, contextPkgPath)))
g.P(grpcPkg, " ", strconv.Quote(path.Join(g.gen.ImportPrefix, grpcPkgPath)))
g.P(contextPkg, " ", strconv.Quote(path.Join(string(g.gen.ImportPrefix), contextPkgPath)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could just do:

generator.GoImportPath(path.Join(string(g.gen.ImportPrefix), context.PkgPath))

instead of:

strconv.Quote(path.Join(string(g.gen.ImportPrefix), contextPkgPath))

to be type correct.

Perhaps even make Join a method of ImportPrefix:

g.gen.ImportPrefix.Join(context.PkgPath)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Didn't make Join a method, because ImportPrefix is terrible and doesn't need more support.

if n := globalPackageNames[name]; n > 1 {
return pkg + strconv.Itoa(n)
for i, orig := 1, name; globalPackageNames[name]; i++ {
name = orig + GoPackageName(strconv.Itoa(i))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you still have to mutate the globalPackageNames eventually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. Done.

This should have been caught by a test, but takes a lot of infrastructure to test. Do you want one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only to support plugins, so whatever. We can add a test in the future.

g.P("package ", name)
} else {
g.P("package ", name, " // import ", strconv.Quote(g.ImportPrefix+string(importPath)))
g.P("package ", name, " // import ", g.ImportPrefix+importPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this path.Join in the grpc package but a straight concatenation here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because ImportPrefix is completely broken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was harsh, so let me expand.

The fact that import_prefix rewrites standard imports (proto, grpc, context) makes no sense. The fact that it rewrites them inconsistently is almost irrelevant, since nobody wanted that rewriting in the first place.

We could try to fix it, but everything that it does is done better and more reasonably by the go_package option.

if n := globalPackageNames[name]; n > 1 {
return pkg + strconv.Itoa(n)
for i, orig := 1, name; globalPackageNames[name]; i++ {
name = orig + GoPackageName(strconv.Itoa(i))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only to support plugins, so whatever. We can add a test in the future.

@neild neild merged commit 9d4962b into golang:dev Mar 3, 2018
@neild neild deleted the perfile branch March 3, 2018 08:02
@meling
Copy link

meling commented Apr 10, 2018

@dsnet @neild Is this PR discussed in a separate issue, where the rational is better explained? I've had a brief discussion with @awalterschulze related to this change which impacts a change I proposed here and presumably can cause some issues with merging this change.
gogo/protobuf#338
gogo/protobuf#341

As a first insight here, I think the current approach to handle imports is pretty flawed. I would suggest that imports generated from any proto file or plugin should be added to a single global registry, and only emitted to the generated output file when all imports have been collected and made unique. That way there wouldn't be a need generate unique import aliases for the same import.

(Caveat: I'm raising this issue without having had time to investigate this issue in depth, and what might be the problem with such a revised import handling strategy. I'm sure there could be compatibility issues, but there are already many such issues arising from the dev branch, so why not fix it properly.)

Secondary, if the import handling cannot be fixed to the preferred approach mentioned above, maybe the following changes be integrated into golang/protobuf:

https://github.com/gogo/protobuf/pull/341/files#diff-c5ab46099bd055be2629de524005c350L717
https://github.com/gogo/protobuf/pull/341/files#diff-4cb022973f0dc3083ae64c102f7e4cc9L132

@neild
Copy link
Contributor Author

neild commented Apr 10, 2018

This change has two main effects:

  • We consistently identify packages by their import paths. A Go package is uniquely identified by its import path. The previous behavior was confusing and produced surprising results in a number of circumstances. e.g., importing the same package several times in the same file under different names.

  • The contents of one source file do not affect the generated code for an unrelated file. It is surprising for the contents of two generated files to vary based on what order they are provided to protoc, or for a change in one file to produce a change in the generated code for a completely unrelated file. This will be particularly important when we address proposal: protoc-gen-go should be able to compile multiple packages per run #39 (compiling multiple Go packages in one protoc invocation).

If I understand the gogo/protobuf change you're asking us to import, it doesn't seem related to this PR at all, however. As far as I can tell, it's changing the import aliases generated by the compiler from lengthy ones based on the full import path (e.g., "github_com_gogo_protobuf_jsonpb") to short ones based on the basename of the import path (e.g., "jsonpb"). The golang/protobuf compiler already generates the latter style, and (so far as I know) always has. The former style with the underscores seems to be a change introduced by gogo/protobuf.

@meling
Copy link

meling commented Apr 10, 2018

Thanks for your explanation @neild, and I apologize for my ignorance. I've mainly used gogo/protobuf due to other features. Is it correct that the golang/protobuf behavior is to supply only unique import paths also from plugins??

@awalterschulze: Could you weigh in on the current approach with "full import path" as import aliases instead of the short ones that @neild refers to? I don't know the rationale for it; I thought it was from golang/protobuf and not gogo/protobuf. I mean if golang/protobuf doesn't produce these long names, I won't need my fix, unless there is some gogo/protobuf reason for them...

(Apologies for not having had the time to examine code and testing things yet... but wanted to ask anyway.)

@neild
Copy link
Contributor Author

neild commented Apr 11, 2018

When the proto compiler generates an import statement, we always generate a local alias for the import. This isn't always strictly necessary, but is safest given that there's no guaranteed way to map from import path to package name.

If a single file imports two packages with the same name (e.g., github.com/golang/protobuf/ptypes/duration and github.com/google/go-genproto/duration), we need to disambiguate somehow. We do that by adding a number to one of them, so duration and duration1.

This is necessary even without plugins; a single proto file might import two other packages with conflicting names.

@awalterschulze
Copy link

Please forgive me, but after merging in all the changes from the dev branch, I am tired.

This tiredness means I didn't properly look at this, because now I am beyond lazy. It could be gogoprotobuf or golang/protobuf, but after the merge, the generated imports seems to include some long underscorey imports. I just assumed this was because the part of the code that @meling changed originally, to fix this, was affected by the merge, that it was something that golang/protobuf changed. So I am sorry, I didn't do my homework.

Sorry to drop the ball :(

@golang golang locked and limited conversation to collaborators Jun 26, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants