Study of symbol lookups in namespaces and types
I thought that, to get a grip on the problem of unqualified symbol lookups, it would be good to study how the compiler works now and what needs to change in the future. fbc 1.09.0 intends to change some behaviours with symbol lookups. Namespaces have never quite worked consistently. Changes have been made and there are more changes yet to make.
Re: Study of symbol lookups in namespaces and types
A simplified example taken from the discussion forum.
Here are the current lookup rules, in priority order:
(1) current namespace
(2) imports to the current namespace
(3) parent namespace, sort of. It works more or less by accident because of the way fbc handles imports under the hood.
Using a simplified example, each case below uses the following declarations:
Code: Select all
Type Rect '' <<--- Rect defined in the GLOBAL namespace
    '' But also creates 'Rect' namespace
    Left As Integer '' <<--- Left is in Rect's namespace
    Right As Integer '' <<--- Right is in Rect's namespace
End Type
Namespace gdiPlus '' <<--- Explicit namespace
    Type Rect '' <<--- Rect is in gdiPlus's namespace
        X As Integer '' <<--- X is in gdiPlus.Rect's namespace
        Y As Integer '' <<--- Y is in gdiPlus.Rect's namespace
    End Type
End Namespace
(1) current namespace priority
Even though 'using gdiplus' imports 'Rect' from gdiplus, there is a 'Rect' defined in the current namespace, so the global 'Rect' is used before the imported 'Rect'
Code: Select all
using gdiplus
dim r As Rect '' <<<--- (1) - ..Rect
r.Left = 10 '' Works
- the code statement is in the global namespace
- 'Rect' is defined in the current (global) namespace
- 'Rect' from gdiplus is imported (but never looked at)
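To make case (1) concrete, here is a minimal sketch that repeats the declarations above so it stands alone. It assumes that explicit qualification (`gdiPlus.Rect`) reaches the imported type even though the unqualified name resolves to the global one; the variable names `r` and `g` are only for illustration.

```freebasic
Type Rect                   '' global Rect
    Left As Integer
    Right As Integer
End Type

Namespace gdiPlus
    Type Rect               '' gdiPlus.Rect
        X As Integer
        Y As Integer
    End Type
End Namespace

Using gdiPlus

Dim r As Rect               '' unqualified: rule (1) picks the global Rect
Dim g As gdiPlus.Rect       '' qualified: names the imported type explicitly

r.Left = 10                 '' member of the global Rect
g.X = 20                    '' member of gdiPlus.Rect

Print r.Left
Print g.X
```

Writing the namespace out like this avoids relying on the implicit priority order at all, which is useful while the rules are still being settled.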
(2) imports to the current namespace priority
In a user-defined namespace, imports are checked before symbols defined outside of the namespace. It doesn't really matter whether the using statement is outside or inside the namespace, because imports from the parent namespace are inherited.
Code: Select all
namespace N
    using gdiplus
    sub proc()
        dim r As Rect '' <<<--- (2) - gdiPlus.Rect
        r.Left = 10 '' error because gdiPlus.Rect doesn't have .Left member
    end sub
end namespace
- the code statement is in the N namespace
- 'Rect' is not defined in the current namespace
- 'Rect' is imported from gdiplus into the N namespace
- Even though 'Rect' exists in the global namespace, it is outside of N's namespace
(3) parent namespace - sort of
I think more investigation is needed here, because this seems to work only by accident. There were a few cases like this under the hood in fbc, where something either worked or produced an error only by accident, due to some unrelated logic. I realize this is not a good explanation, but there is some really tricky stuff going on in the compiler.
Code: Select all
namespace P
    using gdiPlus
    Type Rect '' <<--- Rect defined in the P's namespace
        '' But also creates 'Rect' namespace
        Left As string '' <<--- Left is in Rect's namespace
    End Type
    namespace N
        sub proc()
            dim r As Rect '' <<<--- (3) - P.Rect
            r.Left = ""
        end sub
    end namespace
end namespace
Re: Study of symbol lookups in namespaces and types
coderJeff wrote: I have 3 goals:
- explicit symbol access through a namespace identifier should always work - there were and are bugs that make this inconsistent and unreliable
- consistent rules for implicit symbol look-up (i.e. no explicit namespace given) - in priority order would be: 1) inner most scope / procedure / namespace, 2) imports to the current namespace, 3) global namespace ... there were and are bugs that make this inconsistent and unreliable
- ambiguous errors if the above rule can't be followed
One of the differences between fbc 1.08 and fbc 1.09:
Code: Select all
Type UDT
    __ As Integer
End Type
Namespace N
    Type UDT
        __ As Integer
    End Type
End Namespace
Using N
Namespace P
    Sub Test
        Dim u As UDT
        #print typeof(u)
    End Sub
End Namespace
Result for fbc 1.08: UDT
Result for fbc 1.09: N.UDT
We could then ask a question about the priority order to retain for implicit symbol look-up:
Why not (1), then (3), then (2)?
('global namespace' taking priority over 'imports to current namespace')
Re: Study of symbol lookups in namespaces and types
What makes it complicated IMO, both for the compiler and programmer, is the keyword "using", which is somewhat similar to "with". Perhaps "using .. end using" would make things easier?
Re: Study of symbol lookups in namespaces and types
For the special case of ENUM structures, fbc 1.08 and fbc 1.09 behave the same for the following:
Code: Select all
Enum
    ex = 1
End Enum
Enum E
    ex = 2
End Enum
Print E.ex '' 2
Print ex '' error 255: Ambiguous symbol access
I think that we should distinguish between the anonymous ENUM and the named ENUMs (those not declared 'explicit'):
Unlike the symbols in named ENUMs, shouldn't the symbols in an anonymous ENUM be treated as symbols in the current scope rather than as imported symbols?
Re: Study of symbol lookups in namespaces and types
fxm wrote: Why not (1), then (3), then (2) ?
('global namespace' taking priority in front of 'imports to current namespace')
A good question. I think the short answer is that fbc does not currently have a sensible rule for imports. The current changes in 1.09.0 fixed another bug with lookups in TYPEs. However, namespace, enum, type and union all create a namespace of some kind, so all are affected.
For example, this situation:
Code: Select all
Type UDT
    __ As Integer
End Type
Namespace N
    Type UDT
        __ As Integer
    End Type
End Namespace
Namespace P
    Sub Test
        Using N
        Dim u As UDT
        #print typeof(u)
    End Sub
End Namespace
fbc currently does not distinguish imports to the current namespace from imports to a parent namespace. All the imports are on the same list.
Maybe the rules need to be changed:
(1) current namespace
(2) imports to current namespace
(3) parent namespace
(4) imports to parent namespace
(...) and so-on
I believe this rule change will be necessary to fix the bug for types that extend types.
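To show where each level of the proposed order would sit, here is a hypothetical layout. The names `A`, `B`, `Outer` and `Inner` are made up for illustration, and the comments describe what the proposed rules would select, not what any released fbc version actually reports (a current compiler may well pick a different symbol or report an ambiguity).

```freebasic
Namespace A
    Type T                  '' reachable only through an import to Outer: level (4)
        __ As Integer
    End Type
End Namespace

Namespace B
    Type T                  '' reachable through an import to Inner: level (2)
        __ As Integer
    End Type
End Namespace

Namespace Outer
    Using A                 '' import to the parent namespace
    Type T                  '' defined in the parent namespace: level (3)
        __ As Integer
    End Type
    Namespace Inner
        Using B             '' import to the current namespace
        Sub proc()
            '' Inner defines no T of its own, so level (1) finds nothing.
            '' Under the proposed order, level (2) would then select B.T,
            '' ahead of Outer.T (3) and A.T (4).
            #print typeof(T)
        End Sub
    End Namespace
End Namespace
```

With a single shared import list, `A.T` and `B.T` cannot be ranked against each other or against `Outer.T`, which is the distinction the extra levels are meant to add.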
Re: Study of symbol lookups in namespaces and types
fxm wrote: I think that we should make a difference between the anonymous ENUM and the named ENUMs (but not 'explicit' declared):
Unlike the symbols in named ENUMs, shouldn't the symbols in an anonymous ENUM be considered symbols in the current scope rather than imported symbols?
Named or unnamed, I think the problem is similar. There is no check or warning that an enum's member conflicts with any other symbol. Another bug yet to be fixed.
This is currently accepted by 1.08.1 and 1.09.0:
Code: Select all
const ex = 0
enum
    ex = 1
end enum
enum
    ex = 2
end enum
enum E
    ex = 3
end enum
print ex
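For comparison, a small sketch of the 'explicit' form of a named enum mentioned earlier in the thread. It assumes `Explicit` behaves as documented, so that the members are reachable only through the enum's name and are not imported into the enclosing namespace; under that assumption the constant and the enum member do not compete.

```freebasic
Const ex = 0

Enum E Explicit
    ex = 3                  '' lives only in E's namespace
End Enum

Print ex                    '' the constant in the current namespace
Print E.ex                  '' the enum member, reached by explicit qualification
```

This only sidesteps the conflict for enums that can be declared `Explicit`; it does nothing for anonymous enums, which have no name to qualify with.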
Re: Study of symbol lookups in namespaces and types
coderJeff wrote: I've created a namespace P to use symbols as I wish. I've imported symbols from N which I prefer to other symbols. Why should global ..UDT still be preferred here? In fbc 1.08.1 it's UDT and in 1.09.0 it's N.UDT. In this case, I think N.UDT is correct.
This fbc 1.08 behavior is described in the documentation.
Extract from the USING (Namespaces) documentation page:
..... For example, if there is duplicated symbol in the global namespace (unnamed namespace), access to local symbol is captured by duplicated global symbol (in that case, full prefixing is required to access local symbol).
Re: Study of symbol lookups in namespaces and types
Munair wrote: What makes it complicated IMO, both for the compiler and programmer, is the keyword "using", which is somewhat similar to "with". Perhaps "using .. end using" would make things easier?
We do have some facility for a scoped using. Currently, 'using <namespace>' takes effect from where it's used forward, until the scope is closed.
Hmm. 'using' is not fully described in the documentation. I don't think it is documented which scopes are allowed. I believe it is allowed in namespaces and procedures only, not in type or union declarations.
Example:
Code: Select all
namespace N
    type I as integer
end namespace
namespace Q
    using N
    #print typeof(I)
end namespace
sub S()
    using N
    #print typeof(I)
end sub
#print typeof(I) '' error because 'I' is not imported to current namespace
Re: Study of symbol lookups in namespaces and types
fxm wrote: Extract from USING (Namespaces) documentation page: ..... For example, if there is duplicated symbol in the global namespace (unnamed namespace), access to local symbol is captured by duplicated global symbol (in that case, full prefixing is required to access local symbol).
Does the rule make sense? Or is it simply documenting what the compiler does (did)? I think the latter is true.
Re: Study of symbol lookups in namespaces and types
coderJeff wrote: We do have some facility for a scoped using. Currently, 'using <namespace>' means from where it's used forward, until the scope is closed.
Hmm. 'using' Is not fully described in the documentation. I don't think which scopes are allowed is documented. I believe it is in namespaces and procedures only, not types or union declarations.
A first step then would be thorough documentation, but I admit that I hardly ever use the 'namespace' feature. Personally I would favour explicit module scopes, somewhat similar to Pascal's units. Can't go wrong with that.
Re: Study of symbol lookups in namespaces and types
coderJeff wrote: Does the rule make sense? Or is it simply documenting what the compiler does (did)? I think the latter is true.
Indeed, this sentence will be moved to the 'Version' section (item "before fbc 1.09.0") if the current behavior of 1.09.0 is confirmed.
Re: Study of symbol lookups in namespaces and types
Munair wrote: A first step then would be a thorough documentation, but I admit that I hardly ever use the 'namespace' feature. Personally I would favour explicit module scopes, somewhat similar to Pascal's units. Can't go wrong with that.
Indeed, the first step is to understand the problem before deciding what needs to change, what stays the same, and what gets completely replaced with something else.
I don't think that units versus namespaces versus modules is the issue here. Conceptually they are similar: each divides source code into identified chunks.
I'm not a Pascal user. But I would guess that if unit1 has 'symbol', and unit2 has 'symbol', and unit3 uses unit1 and unit2, then there is some rule that says which 'symbol' wins by default, and some method to explicitly name unit1's or unit2's 'symbol' if needed.
What we need in fbc now is a well-defined rule, preferably one that makes sense and can be applied consistently in a variety of contexts.
Re: Study of symbol lookups in namespaces and types
fxm wrote: Indeed, this sentence will be moved to the 'Version' section (item "before fbc 1.09.0") if the current behavior of 1.09.0 is confirmed.
Previous example, rehashed. The behaviour in 1.09.0 makes sense to me for this example.
Code: Select all
type T
    __ as integer
end type
namespace N
    type T
        __ as integer
    end type
end namespace
namespace P0
    sub proc()
        #print typeof(T) '' T in 1.08 and 1.09
    end sub
end namespace
namespace P1
    type T '' <<<--- local T takes priority
        __ as integer
    end type
    sub proc()
        #print typeof(T) '' P1.T in 1.08 and 1.09
    end sub
end namespace
namespace P2
    using N '' <<<--- import N.T
    type T '' <<<--- but local T takes priority
        __ as integer
    end type
    sub proc()
        #print typeof(T) '' P2.T in 1.08 and 1.09
    end sub
end namespace
namespace P3
    type T '' <<<--- local T still takes priority
        __ as integer
    end type
    using N '' <<<--- even though we import N.T
    sub proc()
        #print typeof(T) '' P3.T in 1.08 and 1.09
    end sub
end namespace
namespace P4
    using N '' <<<--- import N.T
    sub proc()
        #print typeof(T) '' T in 1.08
        '' N.T in 1.09
    end sub
end namespace
using P4
#print typeof(T) '' T in 1.08 and 1.09
Re: Study of symbol lookups in namespaces and types
Next to consider is Xusinboy's other example. Which rules make sense, and where are the bugs? Is the behaviour here consistent with the namespace rules described previously, or does everything have to be adjusted?
Simplified:
Code: Select all
Enum E
    Left
    Right
End Enum
Type C
    Private:
        __ As Integer
    Public:
        Declare Property Left As Integer
End Type
Property C.Left As Integer
    Return 0
End Property
Type P Extends C
    Declare Sub proc()
End Type
Sub P.proc()
    Dim a As String = "test"
    a = Left(a, 2) '' Ambiguous between C.Left and E.Left
    a = Right(a, 1) '' OK
End Sub
EDIT: this appears correct to me if we take the 'Enum E' out of it. I believe Enums are causing the problems. They are inherited, and so end up on the list of imports.
Otherwise, the related changes / bug fixes were reported in:
- #871 Inherited methods without this shadowed by global functions
- #730 Using quirk keywords as identifier leads to parsing problems later