This paper proposes and evaluates MarLee, a multi-agent reinforcement learning system that integrates exploitation-oriented and exploration-oriented learning. Compared with conventional reinforcement learning methods, MarLee is more robust to a dynamically changing environment and can perform exploration-oriented learning efficiently even in a large-scale environment. MarLee is therefore well suited to autonomous systems, such as software agents and mobile robots, that operate in dynamic, large-scale environments such as the real world and the Internet. Spreading activation, based on the behavior-based approach, is used to explore the environment, so the learning characteristics can be tuned easily by adjusting the spreading-activation parameters. The fundamental effectiveness of MarLee was demonstrated by simulation.